This course develops the algebraic theory of symmetric functions, beginning with symmetric polynomials in finitely many variables and then passing to the stable ring that captures their behavior as the number of variables grows. The central goal is to understand the structure of this ring, its standard bases, and the algebraic operations and identities that organize the subject. Along the way, the course connects formal power series, representation-theoretic ideas, and enumerative combinatorics in a single framework.
The early chapters build the foundational language: symmetric polynomials, classical bases, inner products, involutions, and duality. From there, the course moves to Schur functions via Young tableaux and determinantal formulae, then studies the Cauchy identities and kernel methods that explain why these functions are so central. Later chapters extend these tools to Pieri rules, skew Schur functions, and the Littlewood-Richardson rule, which together govern multiplication and decomposition phenomena.
The final part of the course explores specializations, generating operators, and transition coefficients, using the theory to derive enumerative consequences and positivity results. The chapters are arranged so that each new idea refines or generalizes the previous ones: basic algebraic structure leads to explicit bases, which lead to identities, rules, and combinatorial formulas, and ultimately to a deeper understanding of how symmetric functions encode rich and pervasive positivity phenomena.
# Introduction
This course develops symmetric functions as a common language for enumeration, polynomial identities, and tableau combinatorics. The guiding question is how a symmetric polynomial identity in finitely many variables can be made stable as the number of variables grows, and how that stable object remembers enough structure to support explicit calculation. The main objects are the classical bases of the ring of symmetric functions, especially the monomial, elementary, complete homogeneous, power sum, and Schur bases.
The course is algebraic and combinatorial in emphasis. Representation theory explains why many of the identities are natural: Schur functions later match irreducible polynomial representations of general linear groups, the Hall inner product reflects character orthogonality, and the Littlewood-Richardson coefficients also govern tensor-product multiplicities. These notes keep the proofs inside polynomial algebra, generating functions, determinants, tableaux, and sign-reversing involutions, while pointing out where those outside interpretations are being shadowed. By the end of the course, the Littlewood-Richardson coefficients will appear as structure constants for Schur functions and as counts of tableaux satisfying a precise word condition.
## What Problem Symmetric Functions Solve
A polynomial such as
\begin{align*}
x_1x_2+x_1x_3+x_2x_3
\end{align*}
is unchanged when the variables are permuted. This symmetry appears in many enumeration problems, but the number of variables often has no intrinsic meaning: it may be a temporary bound on the size of an alphabet, the number of possible colours, or the number of roots of an auxiliary polynomial. The first problem is to separate the stable algebraic content from the finite-variable presentation.
Throughout the finite-variable discussion, $k[x_1,\dots,x_n]$ denotes the polynomial ring in the variables $x_1,\dots,x_n$ with coefficients in the field or commutative ring $k$. The symmetric group $S_n$ acts on this ring by permuting the variables, and $k[x_1,\dots,x_n]^{S_n}$ denotes the subring of polynomials fixed by every such permutation.
[definition: Symmetric Polynomial]
Let $k$ be a field and let $n \in \mathbb N$. A polynomial $f \in k[x_1, \dots, x_n]$ is symmetric if
\begin{align*}
f(x_{\sigma(1)}, \dots, x_{\sigma(n)}) = f(x_1, \dots, x_n)
\end{align*}
for every $\sigma \in S_n$.
[/definition]
This definition is finite-variable, but the course quickly asks which statements survive when $n$ changes. The elementary symmetric polynomials give the first answer, so they must be named before the structure theorem can be stated.
[definition: Elementary Symmetric Polynomial]
Let $k$ be a field and let $n \in \mathbb N$. For $0 \le r \le n$, the $r$th elementary symmetric polynomial in $k[x_1, \dots, x_n]$ is
\begin{align*}
e_r(x_1, \dots, x_n)=\\sum_{1 \le i_1<\cdots<i_r \le n}x_{i_1}\cdots x_{i_r},
\end{align*}
with $e_0=1$.
[/definition]
These polynomials encode square-free choices of variables. To turn symmetry into a workable algebraic condition, we need to know whether every symmetric polynomial can be reconstructed from these square-free coordinate functions. The finite-variable theorem says that they are not merely examples of symmetric polynomials: they are coordinates for all symmetric polynomials in $n$ variables.
[quotetheorem:5179]
[citeproof:5179]
The hypothesis that $f$ be symmetric is essential: $x_1$ cannot be expressed as a polynomial in $e_1, \dots, e_n$ when $n>1$, because every such expression is fixed by all permutations of the variables. The theorem also does not say that the same finite list of generators works uniformly for all $n$; the generator $e_n$ disappears after setting $x_n=0$. This limitation is the reason the course passes from symmetric polynomials in a fixed number of variables to the stable ring of symmetric functions.
[example: Elementary Symmetric Polynomials In Three Variables]
In $k[x_1,x_2,x_3]$, the elementary symmetric polynomials are
\begin{align*}
e_1 &= x_1+x_2+x_3,\\
e_2 &= x_1x_2+x_1x_3+x_2x_3,\\
e_3 &= x_1x_2x_3.
\end{align*}
The symmetric polynomial $x_1^2+x_2^2+x_3^2$ is not one of these three elementary generators, but it can be written in terms of them. Expanding $e_1^2$ gives
\begin{align*}
e_1^2
&=(x_1+x_2+x_3)^2\\
&=x_1^2+x_1x_2+x_1x_3+x_2x_1+x_2^2+x_2x_3+x_3x_1+x_3x_2+x_3^2\\
&=x_1^2+x_2^2+x_3^2+2x_1x_2+2x_1x_3+2x_2x_3\\
&=x_1^2+x_2^2+x_3^2+2e_2.
\end{align*}
Subtracting $2e_2$ from both sides gives
\begin{align*}
e_1^2-2e_2
&=x_1^2+x_2^2+x_3^2+2e_2-2e_2\\
&=x_1^2+x_2^2+x_3^2.
\end{align*}
Thus
\begin{align*}
x_1^2+x_2^2+x_3^2=e_1^2-2e_2.
\end{align*}
This example shows how a symmetric polynomial can be converted into coordinates built from the elementary symmetric polynomials.
[/example]
## The Stable Ring And Its Bases
Once the number of variables is no longer fixed, individual polynomial expressions must be organised by degree. A homogeneous symmetric polynomial of degree $d$ only involves monomials of total degree $d$, so for $n \ge d$ there are enough variables to represent all partition shapes of degree $d$. This is why partitions index the natural bases of the stable ring.
[definition: Partition]
A partition of $d \in \mathbb N$ is a finite weakly decreasing sequence $\lambda=(\\lambda_1, \dots, \\lambda_r)$ of positive integers such that
\begin{align*}
|\lambda| := \\lambda_1+\cdots+\\lambda_r = d.
\end{align*}
The number $r$ is the length of $\lambda$, denoted $\\ell(\lambda)$.
[/definition]
Partitions record exponent patterns after variables have been sorted. They also serve as Young diagram shapes, which later become the shapes of tableaux and skew tableaux.
[example: Monomial Symmetric Function]
For the partition $\lambda=(2,1)$, the exponent pattern means that one variable has exponent $2$, one different variable has exponent $1$, and the remaining variable has exponent $0$. In three variables, the ordered choices of the squared variable and the linear variable are
\begin{align*}
(1,2),\ (1,3),\ (2,1),\ (2,3),\ (3,1),\ (3,2),
\end{align*}
so the corresponding monomials are
\begin{align*}
x_1^2x_2,\ x_1^2x_3,\ x_2^2x_1,\ x_2^2x_3,\ x_3^2x_1,\ x_3^2x_2.
\end{align*}
Thus
\begin{align*}
m_{(2,1)}(x_1,x_2,x_3)
&=x_1^2x_2+x_1^2x_3+x_2^2x_1+x_2^2x_3+x_3^2x_1+x_3^2x_2.
\end{align*}
Each term has total degree $2+1=3$, and every term is obtained from $x_1^2x_2$ by renaming the variables while preserving the exponent pattern $(2,1,0)$.
In the stable ring, $m_{(2,1)}$ is the compatible family whose specialization to $n$ variables is
\begin{align*}
m_{(2,1)}(x_1,\dots,x_n)
=\sum_{\substack{1\le i,j\le n\\ i\ne j}}x_i^2x_j.
\end{align*}
Setting $x_n=0$ removes exactly the summands with $i=n$ or $j=n$, leaving
\begin{align*}
\sum_{\substack{1\le i,j\le n-1\\ i\ne j}}x_i^2x_j
=
m_{(2,1)}(x_1,\dots,x_{n-1}),
\end{align*}
so these finite-variable polynomials fit together under the stable specialization maps. The construction records the exponent pattern $(2,1)$ and treats the variable names as interchangeable.
[/example]
The example suggests that stable symmetric objects should be recorded by their partition-indexed monomial expansions rather than by a fixed list of variables. To add and multiply such objects, we need a single graded algebra whose degree-$d$ part is spanned by all partition shapes of size $d$.
[definition: Ring Of Symmetric Functions]
The ring of symmetric functions over $k$, denoted $\operatorname{Sym}_k$, is the graded inverse-limit algebra
\begin{align*}
\operatorname{Sym}_k=\\varprojlim_n k[x_1,\dots,x_n]^{S_n}
\end{align*}
with respect to the homomorphisms
\begin{align*}
\\rho_n:k[x_1,\dots,x_n]^{S_n}\longrightarrow k[x_1,\dots,x_{n-1}]^{S_{n-1}},\qquad
\\rho_n(f)=f(x_1,\dots,x_{n-1},0),
\end{align*}
restricted degree by degree.
The degree-$d$ component consists of compatible families $(f_n)_{n \ge 1}$ with $f_n \in k[x_1,\dots,x_n]^{S_n}$ homogeneous of degree $d$ and $\\rho_n(f_n)=f_{n-1}$. The full ring is the direct sum of these degreewise inverse limits.
[/definition]
The monomial symmetric functions $m_\lambda$ give the degree-$d$ part a basis indexed by partitions $\lambda$ with $|\lambda|=d$, and multiplication is induced from multiplication of symmetric polynomials after choosing enough variables for the relevant degree. After this definition, identities in infinitely many variables are read degree by degree. This avoids convergence questions: $H(t)E(-t)=1$, Cauchy identities, and Schur expansions are formal identities in graded or completed rings, not assertions about numerical convergence after substituting complex values for all variables.
## Generating Functions As Algebraic Devices
Many symmetric function identities are too large to write coefficient by coefficient. The course therefore uses generating functions not as analytic objects, but as compact algebraic encodings of infinitely many homogeneous identities.
The elementary functions count square-free choices of variables. Their complementary family counts choices with repetition, which is the stable version of complete homogeneous polynomials in finitely many variables.
[definition: Complete Homogeneous Symmetric Function]
For $r \ge 0$, the $r$th complete homogeneous symmetric function $h_r \in \operatorname{Sym}_k$ is
\begin{align*}
h_r=\sum_{i_1 \le \cdots \le i_r} x_{i_1}\cdots x_{i_r},
\end{align*}
with $h_0=1$.
[/definition]
More explicitly, the formula means the compatible family whose specialization to the alphabet $\{x_1,\dots,x_n\}$ is
\begin{align*}
h_r(x_1,\dots,x_n)=\sum_{1 \le i_1 \le \cdots \le i_r \le n}x_{i_1}\cdots x_{i_r}.
\end{align*}
This finite-alphabet version is a polynomial, and the stable element $h_r$ is obtained by requiring compatibility under $x_n=0$. The definition gives a second coordinate system for symmetric functions, complementary to the square-free elementary functions, but coefficient-by-coefficient comparisons between $e_r$ and $h_r$ soon become unwieldy. To make their relationship usable in calculations, the course packages both families into formal generating series; multiplication of those series then records all recurrence relations between the two bases at once.
[definition: Elementary And Complete Generating Series]
In $\operatorname{Sym}_k[[t]]$, define
\begin{align*}
E(t) &= \\sum_{r \ge 0} e_r t^r, & H(t) &= \\sum_{r \ge 0} h_r t^r,
\end{align*}
where $e_0=h_0=1$.
[/definition]
The two series encode complementary counting problems: $e_r$ chooses $r$ distinct variables, while $h_r$ chooses $r$ variables with repetition. The next identity is needed because it turns that complementarity into explicit recurrences and change-of-basis formulas.
[quotetheorem:5180]
[citeproof:5180]
The identity is a stable-ring statement, not merely a mnemonic from finite alphabets. The ambient ring matters here: the expression $H(t)E(-t)$ belongs to $\operatorname{Sym}_k[[t]]$ because the coefficient of each $t^r$ is a finite sum $\sum_{i=0}^r(-1)^i h_{r-i}e_i$ inside $\operatorname{Sym}_k$. Thus the theorem can be used coefficient by coefficient, giving recurrences such as $h_2=e_1h_1-e_2$ and, in higher degrees, systematic translations between complete and elementary coordinates.
The formal-power-series convention is not cosmetic. A naive infinite product such as $\prod_{i\ge 1}(1-x_i t)^{-1}$ does not by itself define an element of a polynomial ring, and after a substitution such as $x_i=1$ for all $i$, even the coefficient of $t$ becomes an infinite numerical sum. The degreewise stable setting avoids this failure because each coefficient is a symmetric function specified by compatible finite-alphabet specializations, not a number obtained by summing over a chosen infinite sequence of values. This distinction is the same one needed later for Cauchy kernels and determinant formulas.
[example: First Coefficients Of The Inverse Identity]
Write
\begin{align*}
H(t)&=1+h_1t+h_2t^2+h_3t^3+\cdots,\\
E(-t)&=1-e_1t+e_2t^2-e_3t^3+\cdots .
\end{align*}
Multiplying and keeping terms through degree $3$ in $t$ gives
\begin{align*}
H(t)E(-t)
&=(1+h_1t+h_2t^2+h_3t^3+\cdots)(1-e_1t+e_2t^2-e_3t^3+\cdots)\\
&=1+(h_1-e_1)t+(h_2-h_1e_1+e_2)t^2\\
&\qquad +(h_3-h_2e_1+h_1e_2-e_3)t^3+\cdots .
\end{align*}
Since $H(t)E(-t)=1$, the coefficients of $t,t^2,t^3$ must be $0$, so
\begin{align*}
h_1-e_1&=0,\\
h_2-h_1e_1+e_2&=0,\\
h_3-h_2e_1+h_1e_2-e_3&=0.
\end{align*}
The first equation gives
\begin{align*}
h_1=e_1.
\end{align*}
Using this in the second equation gives
\begin{align*}
h_2
&=h_1e_1-e_2\\
&=e_1e_1-e_2\\
&=e_1^2-e_2.
\end{align*}
Using $h_1=e_1$ and $h_2=e_1^2-e_2$ in the third equation gives
\begin{align*}
h_3
&=h_2e_1-h_1e_2+e_3\\
&=(e_1^2-e_2)e_1-e_1e_2+e_3\\
&=e_1^3-e_2e_1-e_1e_2+e_3\\
&=e_1^3-2e_1e_2+e_3,
\end{align*}
where the last line uses commutativity in $\operatorname{Sym}_k$. Thus the single formal identity recursively determines the complete functions from the elementary functions degree by degree.
[/example]
## Schur Functions And Tableaux
The second half of the course asks why Schur functions deserve special status among all symmetric functions. Their definitions come from determinants and tableaux, but their importance comes from the fact that they interact well with multiplication and with the Hall inner product.
[definition: Semistandard Young Tableau]
Let $\lambda$ be a partition. A semistandard Young tableau of shape $\lambda$ is a filling of the Young diagram of $\lambda$ by positive integers such that rows are weakly increasing from left to right and columns are strictly increasing from top to bottom.
[/definition]
The tableau definition turns a symmetric function into a weighted enumeration problem. If $T$ is a semistandard Young tableau, its weight is the sequence $\operatorname{wt}(T)=(\alpha_1,\alpha_2,\dots)$, where $\alpha_i$ is the number of entries equal to $i$ in $T$, and
\begin{align*}
x^{\operatorname{wt}(T)}=x_1^{\alpha_1}x_2^{\alpha_2}\cdots .
\end{align*}
The determinant definition, introduced later through the Jacobi-Trudi identity, turns the same object into a structured polynomial expression in the $h_r$.
[definition: Schur Function]
For a partition $\lambda$, the Schur function $s_\lambda$ is the symmetric function
\begin{align*}
s_\lambda = \\sum_T x^{\operatorname{wt}(T)},
\end{align*}
where the sum is over all semistandard Young tableaux $T$ of shape $\lambda$.
[/definition]
Although the tableau sum ranges over entries in all positive integers, it defines a stable symmetric function rather than an analytic series. In the finite alphabet $\{1,\dots,n\}$, the specialization is the finite polynomial
\begin{align*}
s_\lambda(x_1,\dots,x_n)=\sum_{\substack{T\ \text{semistandard of shape }\lambda\\ \text{with entries in }\{1,\dots,n\}}}x^{\operatorname{wt}(T)}.
\end{align*}
These specializations are compatible under $x_n=0$, because tableaux using the entry $n$ are exactly the terms that disappear. Thus the definition is compatible with the completed or degreewise stable interpretation used for the complete homogeneous functions.
This definition is combinatorial, while many of the main proofs require algebraic control. The course proves that the tableau definition agrees with determinantal formulas and then uses this agreement to derive basis and product results.
[quotetheorem:5181]
[citeproof:5181]
This theorem is a bridge between two languages: determinants support algebraic manipulation, while tableaux support enumeration. The length bound fixes the size of the determinant; adding extra zero parts to $\lambda$ gives an equivalent larger determinant, but the conventions $h_0=1$ and $h_m=0$ for $m<0$ are needed for this stability and for boundary terms in the cancellation argument. The identity gives an expression for $s_\lambda$ in the complete basis; by itself it does not prove positivity of Schur product coefficients or describe the tableaux that appear in products.
The conventions also prevent false boundary terms. For example, if $\lambda=(1,1)$ and $r=2$, Jacobi-Trudi gives
\begin{align*}
s_{(1,1)}=\det\begin{pmatrix}h_1&h_2\\ h_0&h_1\end{pmatrix}=h_1^2-h_2.
\end{align*}
If $h_0$ were not set equal to $1$, the second-row contribution would be undefined or would lose the subtraction term needed to obtain $e_2$. If negative-index terms were allowed to behave like ordinary new symbols instead of being set to $0$, then padded determinants would acquire spurious entries below the boundary of the partition. Later chapters use this bridge to prove the Cauchy identity and to compute products of Schur functions.
[example: The Schur Function $s_{(2)}$]
A semistandard tableau of shape $(2)$ has one row with two boxes. If its entries are $i$ in the left box and $j$ in the right box, the row condition in the definition of semistandard tableau is exactly $i \le j$, and there is no column condition because the diagram has only one row. Its weight monomial is therefore $x_i x_j$, so the tableau definition of the Schur function gives
\begin{align*}
s_{(2)}
&=\sum_{\substack{i,j\ge 1\\ i\le j}} x_i x_j.
\end{align*}
By the definition of the complete homogeneous symmetric function with $r=2$,
\begin{align*}
h_2
&=\sum_{i_1\le i_2}x_{i_1}x_{i_2}.
\end{align*}
Renaming $i_1=i$ and $i_2=j$ shows that the two sums have the same indexing set and the same summand for each index pair:
\begin{align*}
s_{(2)}
&=\sum_{\substack{i,j\ge 1\\ i\le j}} x_i x_j\\
&=\sum_{i_1\le i_2}x_{i_1}x_{i_2}\\
&=h_2.
\end{align*}
The determinant formula gives the same result: for $\lambda=(2)$ and $r=1$, the Jacobi-Trudi matrix is
\begin{align*}
\bigl(h_{\lambda_1-1+1}\bigr)=\bigl(h_2\bigr),
\end{align*}
and the determinant of a $1\times 1$ matrix is its only entry, so
\begin{align*}
s_{(2)}=\det\bigl(h_2\bigr)=h_2.
\end{align*}
Thus the two-box row Schur function is exactly the complete homogeneous symmetric function of degree $2$.
[/example]
## Product Rules And The Aim Of The Course
The endpoint of the course is the multiplication problem for Schur functions. Before that question has a well-defined answer, the Schur functions must be known to give coordinates on the whole ring.
[quotetheorem:5182]
[citeproof:5182]
This theorem is the algebraic reason that products of Schur functions have unique Schur expansions. It does not yet say that the coefficients in those expansions are non-negative, nor does it give a combinatorial formula for computing them. Basis existence gives coordinates; the later Pieri and Littlewood-Richardson rules explain why those coordinates count tableaux rather than arising as opaque scalars. It also depends on the stable ring and on working degree by degree: in a fixed set of $n$ variables, only partitions with length at most $n$ survive, while $\operatorname{Sym}_k$ retains every partition shape in its appropriate degree.
The stable hypothesis cannot be dropped without changing the indexing set. In one variable, $s_{(1,1)}(x_1)=0$ because no column-strict filling of a two-box column uses only the entry $1$, so the full partition-indexed Schur family is not a basis of $k[x_1]^{S_1}$. In two variables, all shapes of length greater than $2$ similarly vanish. The degreewise stable ring is therefore the setting in which every partition survives, and this is what allows product coefficients to be indexed uniformly before any finite-variable specialization is chosen.
With the basis property in place, every product $s_\lambda s_\mu$ has a unique expansion in the Schur basis, and the coefficients are the Littlewood-Richardson coefficients. These numbers are structure constants: they measure how multiplication in $\operatorname{Sym}_k$ looks when Schur functions are used as coordinates. The next definition isolates those constants before the course gives a tableau rule for computing them, so the later combinatorial theorem has a precise algebraic target.
[definition: Littlewood Richardson Coefficient]
For partitions $\lambda$ and $\mu$, the Littlewood-Richardson coefficients $c_{\lambda\mu}^{\nu}$ are the scalars defined by
\begin{align*}
s_\lambda s_\mu = \\sum_\nu c_{\lambda\mu}^{\nu}s_\nu.
\end{align*}
[/definition]
The definition names the structure constants, but it does not yet make them computable. To state the counting rule, the admissible skew tableaux must be specified rather than only invoked by name.
[definition: Littlewood Richardson Tableau]
Let $\lambda \subset \nu$ be partitions, and let $\mu$ be a partition. A Littlewood-Richardson tableau of skew shape $\nu/\lambda$ and content $\mu$ is a semistandard filling of the boxes of $\nu/\lambda$ with exactly $\mu_i$ entries equal to $i$ for each $i$, such that its reading word, read along rows from right to left starting with the top row and moving downward, is a lattice word: in every initial segment of the word, the number of entries equal to $i$ is at least the number of entries equal to $i+1$ for every $i \ge 1$.
[/definition]
This condition is the extra constraint that makes the product rule selective. A semistandard skew tableau with the right content may fail the lattice-word test, and then it does not contribute to the Schur product coefficient. The main combinatorial theorem is needed because it converts coefficient extraction into this finite count.
[quotetheorem:5183]
[citeproof:5183]
Although the full proof comes near the end, the statement explains why the earlier material is organised as it is. The containment $\lambda \subset \nu$ is needed before the skew diagram $\nu/\lambda$ exists, and the size condition matches degrees in the homogeneous product $s_\lambda s_\mu$. The theorem also does not compute a coefficient from the shape alone: the content and the lattice-word condition are part of the data being counted.
Each condition rules out a concrete failure. If $\lambda=(2)$, $\mu=(1)$, and $\nu=(1,1)$, then $|\nu|\ne |\lambda|+|\mu|$, so degree considerations force $c_{\lambda\mu}^{\nu}=0$ even though the shapes themselves are legitimate partitions. If the lattice-word condition is removed, then the skew shape $(2,1)/(1)$ with content $(1,1)$ has semistandard fillings that pass the row and column tests but whose reading word begins with $2$; counting them would add tableaux that do not occur in the Schur product coefficient. Each classical basis, generating identity, and tableau construction supplies one ingredient needed to make the product rule precise and computable.
## How To Read These Notes
The main difficulty in reading the course is that the same object often has several valid descriptions. A symmetric function may be written in the elementary, complete, monomial, power-sum, or Schur basis; it may be studied in a finite alphabet or in the stable ring; and a proof may be algebraic in generating functions or combinatorial in tableaux. The chapters follow the course progression so that these translations are introduced when they are needed rather than collected in dictionary order.
We begin with symmetric polynomials and the stable ring, then build the classical bases and their transition identities. Schur functions enter through tableaux and determinants, followed by Cauchy identities, scalar products, Pieri rules, and finally the Littlewood-Richardson rule.
The proofs are meant to show reusable methods. When a result is algebraic, look for filtrations, triangularity, and generating series. When a result is combinatorial, look for weight-preserving bijections, cancellation involutions, and tableau algorithms. The course repeatedly moves between these methods because symmetric functions are designed to make that translation possible.
The opening chapter has emphasized the translation between algebraic identities, filtrations, and combinatorial models, so the next step is to make that translation concrete in the most basic setting: symmetric polynomials in finitely many variables. Chapter 1 begins by asking how to describe the invariant polynomials and how those descriptions stabilize as the number of variables grows.
# 1. Symmetric Polynomials and the Stable Ring
Symmetric functions begin with a finite problem: understand polynomials in $n$ variables that do not change when the variables are permuted. The first goal is to find generators for these invariant polynomials and to control how expressions change as the number of variables grows. This chapter moves from symmetric polynomials in $x_1, \dots, x_n$ to the stable ring $\operatorname{Sym}$, where identities are read degree by degree rather than as ordinary polynomials in a fixed finite list of variables.
## Symmetric Polynomials in Finitely Many Variables
The basic question is: if a polynomial is unchanged by every relabelling of the variables, what data determines it? The answer is that the elementary symmetric polynomials are enough, and this is the first structural theorem of the course.
[definition: Symmetric Polynomial]
Let $k$ be a commutative ring and let $n \in \mathbb N$. A polynomial $f \in k[x_1, \dots, x_n]$ is symmetric if
\begin{align*}
f(x_{\sigma(1)}, \dots, x_{\sigma(n)}) = f(x_1, \dots, x_n)
\end{align*}
for every $\sigma \in S_n$.
The ring of symmetric polynomials in $n$ variables is
\begin{align*}
\operatorname{Sym}_n(k) := k[x_1, \dots, x_n]^{S_n}.
\end{align*}
[/definition]
Thus $\operatorname{Sym}_n(k)$ is the invariant subring for the natural action of $S_n$ on the polynomial ring. To build this ring explicitly, we need symmetric polynomials that are simple enough to name and broad enough to serve as generators. The first candidates should not single out $x_1$ or impose an ordering on the variables; instead, they should collect all squarefree monomials of a fixed degree at once. The following definition names these orbit sums, which will become the basic generators in the finite and stable theories.
[definition: Elementary Symmetric Polynomial]
For $0 \le r \le n$, the $r$th elementary symmetric polynomial in $n$ variables is
\begin{align*}
e_r^{(n)} := \sum_{1 \le i_1 < \cdots < i_r \le n} x_{i_1}\cdots x_{i_r} \in k[x_1, \dots, x_n],
\end{align*}
with $e_0^{(n)} := 1$.
[/definition]
The first few cases are $e_1^{(n)} = x_1 + \cdots + x_n$, $e_2^{(n)} = \sum_{i<j}x_ix_j$, and $e_n^{(n)} = x_1\cdots x_n$. These polynomials appear as the coefficients of the monic polynomial whose roots are $x_1, \dots, x_n$:
\begin{align*}
\prod_{i=1}^n (t+x_i) = \sum_{r=0}^n e_r^{(n)} t^{n-r}.
\end{align*}
The coefficient formula explains why elementary symmetric polynomials are natural, but it does not yet say that they generate every symmetric polynomial. A small computation gives evidence for the theorem that follows.
[example: Elementary Symmetric Polynomials In Three Variables]
In $k[x_1,x_2,x_3]$, the elementary symmetric polynomials are
\begin{align*}
e_1^{(3)} &= x_1+x_2+x_3,\\
e_2^{(3)} &= x_1x_2+x_1x_3+x_2x_3,\\
e_3^{(3)} &= x_1x_2x_3.
\end{align*}
They occur as the coefficients of the polynomial with formal variable $t$:
\begin{align*}
(t+x_1)(t+x_2)(t+x_3)
&= (t^2+(x_1+x_2)t+x_1x_2)(t+x_3)\\
&= t^3+(x_1+x_2)t^2+x_1x_2t+x_3t^2+x_3(x_1+x_2)t+x_1x_2x_3\\
&= t^3+(x_1+x_2+x_3)t^2+(x_1x_2+x_1x_3+x_2x_3)t+x_1x_2x_3\\
&= t^3+e_1^{(3)}t^2+e_2^{(3)}t+e_3^{(3)}.
\end{align*}
For a degree-three example, multiply the first two elementary symmetric polynomials:
\begin{align*}
e_1^{(3)}e_2^{(3)}
&= (x_1+x_2+x_3)(x_1x_2+x_1x_3+x_2x_3)\\
&= x_1(x_1x_2+x_1x_3+x_2x_3)
+x_2(x_1x_2+x_1x_3+x_2x_3)\\
&\qquad +x_3(x_1x_2+x_1x_3+x_2x_3)\\
&= x_1^2x_2+x_1^2x_3+x_1x_2x_3
+x_1x_2^2+x_1x_2x_3+x_2^2x_3\\
&\qquad +x_1x_2x_3+x_1x_3^2+x_2x_3^2\\
&= x_1^2x_2+x_1^2x_3+x_1x_2^2+x_2^2x_3+x_1x_3^2+x_2x_3^2+3x_1x_2x_3.
\end{align*}
Thus $e_1^{(3)}e_2^{(3)}$ contains exactly the six monomials $x_i^2x_j$ with $i\ne j$, and the remaining contribution is $3e_3^{(3)}$.
[/example]
The example shows a rewriting in one degree, but a few identities do not rule out two different polynomials in the symbols $e_1^{(n)},\dots,e_n^{(n)}$ giving the same symmetric polynomial. To use the $e_r^{(n)}$ as coordinates on $\operatorname{Sym}_n(k)$, one needs both existence of such expressions and uniqueness after the number of variables has been fixed.
[quotetheorem:5179]
[citeproof:5179]
The theorem turns questions about symmetric polynomials into questions about an ordinary polynomial ring, but each hypothesis marks a boundary. The number of variables must be fixed: in $\operatorname{Sym}_2(k)$ there is no independent generator $e_3^{(2)}$, while in $\operatorname{Sym}_3(k)$ the element $e_3^{(3)}=x_1x_2x_3$ is needed to describe products involving all three variables. The list of generators also stops at $e_n^{(n)}$, because $e_{n+1}^{(n)}$ is the empty sum and hence equals $0$; adjoining a symbol $e_{n+1}^{(n)}$ would introduce a false coordinate. Finally, the polynomial must lie in the symmetric subring: $x_1\in k[x_1,x_2]$ cannot be written as a polynomial in $e_1^{(2)}$ and $e_2^{(2)}$, since every such polynomial is fixed by swapping $x_1$ and $x_2$. Thus the theorem gives coordinates only after $n$ has been fixed and after symmetry has been imposed; it does not by itself explain how formulas should be compared as $n$ varies. In practice, the leading-monomial algorithm is often replaced by identities that recognise common symmetric sums.
[example: Rewriting A Degree Three Symmetric Sum]
Let
\begin{align*}
f=\sum_{1\le i<j\le n}(x_i^2x_j+x_ix_j^2),
\end{align*}
so $f$ is the monomial symmetric polynomial of shape $(2,1)$ in $n$ variables. We compute $e_1^{(n)}e_2^{(n)}$ by separating the terms according to whether the index coming from $e_1^{(n)}$ is already present in the pair coming from $e_2^{(n)}$:
\begin{align*}
e_1^{(n)}e_2^{(n)}
&=\left(\sum_{a=1}^n x_a\right)\left(\sum_{1\le b<c\le n}x_bx_c\right)\\
&=\sum_{a=1}^n\sum_{1\le b<c\le n}x_ax_bx_c\\
&=\sum_{1\le b<c\le n}x_b^2x_c+\sum_{1\le b<c\le n}x_bx_c^2
+\sum_{\substack{1\le a\le n\\ 1\le b<c\le n\\ a\ne b,\ a\ne c}}x_ax_bx_c.
\end{align*}
The first two sums are exactly $f$. For the remaining sum, fix three distinct indices $i<j<k$. The monomial $x_ix_jx_k$ occurs once from
\begin{align*}
(a,b,c)=(i,j,k),
\end{align*}
once from
\begin{align*}
(a,b,c)=(j,i,k),
\end{align*}
and once from
\begin{align*}
(a,b,c)=(k,i,j).
\end{align*}
These are the only possibilities, because $b<c$ and $\{a,b,c\}=\{i,j,k\}$. Hence
\begin{align*}
\sum_{\substack{1\le a\le n\\ 1\le b<c\le n\\ a\ne b,\ a\ne c}}x_ax_bx_c
&=3\sum_{1\le i<j<k\le n}x_ix_jx_k\\
&=3e_3^{(n)}.
\end{align*}
Therefore
\begin{align*}
e_1^{(n)}e_2^{(n)}=f+3e_3^{(n)},
\end{align*}
and subtracting $3e_3^{(n)}$ gives
\begin{align*}
f=e_1^{(n)}e_2^{(n)}-3e_3^{(n)}.
\end{align*}
For $n=2$, the sum over triples $i<j<k$ is empty, so the $e_3$ term disappears after two-variable specialisation. In the stable notation used later, the same identity is written
\begin{align*}
m_{(2,1)}=e_1e_2-3e_3.
\end{align*}
[/example]
This calculation already hints at the stable viewpoint. The formula is independent of $n$ once all elementary functions are allowed, but finite rings force $e_r^{(n)}=0$ for $r>n$.
## Partitions And Young Diagrams
The next problem is how to label symmetric sums systematically. Since a symmetric polynomial contains all permutations of a monomial together, the exponents should be recorded without regard to order; this leads to partitions.
[definition: Partition]
A partition is a finite weakly decreasing sequence
\begin{align*}
\lambda = (\lambda_1, \lambda_2, \dots, \lambda_\ell)
\end{align*}
of positive integers. Its degree is $|\lambda| := \lambda_1 + \cdots + \lambda_\ell$, and its length is $\ell(\lambda):=\ell$.
[/definition]
Partitions record exponent patterns algebraically. To see operations on these patterns, especially conjugation and later tableau constructions, we need a geometric encoding; this motivates Young diagrams.
[definition: Young Diagram And Conjugate Partition]
The Young diagram of a partition $\lambda$ is the left-justified array with $\lambda_i$ boxes in row $i$. The conjugate partition $\lambda'$ is defined by
\begin{align*}
\lambda'_j := |\{i : \lambda_i \ge j\}|.
\end{align*}
[/definition]
[illustration:young-diagram-conjugation-421]
Conjugation reflects the Young diagram across its main diagonal. For example, $(4,2,1)'=(3,2,1,1)$, since the column lengths of a diagram with rows $4,2,1$ are $3,2,1,1$. The same partitions also index the orbit sums of monomials, so we next turn the combinatorial labels into symmetric polynomials.
[example: Partitions Of Four]
The partitions of $4$ are
\begin{align*}
(4),\quad (3,1),\quad (2,2),\quad (2,1,1),\quad (1,1,1,1).
\end{align*}
To see that this list is complete, write a partition of $4$ as $\lambda=(\lambda_1,\dots,\lambda_\ell)$ with $\lambda_1\ge \cdots \ge \lambda_\ell>0$ and $\lambda_1+\cdots+\lambda_\ell=4$. If $\lambda_1=4$, then the remaining sum is $0$, giving $(4)$. If $\lambda_1=3$, then the remaining sum is $1$, giving $(3,1)$. If $\lambda_1=2$, then the remaining sum is $2$, and the weakly decreasing possibilities are $(2,2)$ and $(2,1,1)$. If $\lambda_1=1$, then every part is $1$, giving $(1,1,1,1)$.
These partitions label the possible exponent patterns of monomials of total degree $4$: for example, $(3,1)$ corresponds to monomials $x_i^3x_j$ with $i\ne j$, while $(2,2)$ corresponds to monomials $x_i^2x_j^2$ with $i\ne j$. In three variables, the shape $(1,1,1,1)$ would require four distinct variables with exponent $1$, but only $x_1,x_2,x_3$ are available, so $m_{(1,1,1,1)}^{(3)}=0$.
[/example]
This last observation is the first place where the number of variables matters. To compare different $n$, we need symmetric polynomials whose labels do not change as long as enough variables are present.
[definition: Monomial Symmetric Polynomial]
Let $\lambda$ be a partition with $\ell(\lambda) \le n$. The monomial symmetric polynomial $m_\lambda^{(n)} \in k[x_1,\dots,x_n]$ is the sum of all distinct monomials obtained by permuting the exponent sequence
\begin{align*}
(\lambda_1,\dots,\lambda_{\ell(\lambda)},0,\dots,0).
\end{align*}
If $\ell(\lambda)>n$, set $m_\lambda^{(n)}:=0$.
[/definition]
The polynomials $m_\lambda^{(n)}$ organise symmetric polynomials by degree and by exponent pattern. The remaining issue is linear algebra: different exponent patterns might a priori satisfy hidden relations, or fail to span all symmetric homogeneous polynomials of a fixed degree. The point of introducing these orbit sums is that they eliminate both problems degree by degree.
[quotetheorem:5184]
[citeproof:5184]
The basis theorem is the bridge from finite variable algebra to the stable ring, and it should not be read as a basis statement for every partition in every finite ring. The length condition $\ell(\lambda)\le n$ excludes shapes that cannot be realised in $n$ variables: $m_{(1,1,1)}^{(2)}=0$, so including it in a two-variable basis would add the zero vector. More generally, $m_{(1^r)}^{(n)}=0$ whenever $r>n$, while $m_{(1^r)}^{(r)}=x_1\cdots x_r$ is nonzero. The homogeneity condition is also essential. If all degrees are pooled without grading, the finite-degree basis vectors still span the whole symmetric ring as a direct sum, but there is no single finite basis and no fixed degree in which to compare coefficients. What the theorem supplies for later use is degree-wise coordinates: in each fixed degree there are finitely many shapes, and those shapes eventually fit inside every sufficiently large set of variables. It does not say that finite specialisation preserves all stable information or that length restrictions can be ignored.
## The Stable Ring Of Symmetric Functions
The finite rings $\operatorname{Sym}_n(k)$ do not form a single increasing chain in the naive sense, since a polynomial in $n+1$ variables can be specialised by setting $x_{n+1}=0$. The stable ring records exactly those compatible families whose degree-$d$ behaviour has stabilised once $n\ge d$.
[definition: Specialisation Maps]
For $n\ge 1$, define
\begin{align*}
\rho_{n+1,n}: \operatorname{Sym}_{n+1}(k) &\longrightarrow \operatorname{Sym}_n(k), & f(x_1,\dots,x_{n+1}) &\longmapsto f(x_1,\dots,x_n,0).
\end{align*}
[/definition]
These maps send $e_r^{(n+1)}$ to $e_r^{(n)}$ for $r\le n$ and send $e_{n+1}^{(n+1)}$ to $0$. Since they provide the comparison maps between all finite variable rings, they motivate defining a single object as the compatible limit.
[definition: Ring Of Symmetric Functions]
The ring of symmetric functions over $k$ is the graded inverse limit
\begin{align*}
\operatorname{Sym}(k) := \varprojlim_n \operatorname{Sym}_n(k)
\end{align*}
with respect to the maps $\rho_{n+1,n}$, taken in the category of graded $k$-algebras.
[/definition]
This definition means that a homogeneous symmetric function of degree $d$ is represented by a compatible sequence $(f_n)_{n\ge 1}$ with $f_n\in \operatorname{Sym}_n(k)$ homogeneous of degree $d$. Since $m_\lambda^{(n)}$ is nonzero for all $n\ge \ell(\lambda)$, it defines a stable element $m_\lambda\in \operatorname{Sym}(k)$.
[example: Comparing Sym Three And The Stable Ring]
In three variables, the fourth elementary symmetric polynomial is the empty sum:
\begin{align*}
e_4^{(3)}
&=\sum_{1\le i_1<i_2<i_3<i_4\le 3}x_{i_1}x_{i_2}x_{i_3}x_{i_4}\\
&=0,
\end{align*}
because there is no strictly increasing four-tuple of indices chosen from $\{1,2,3\}$. Hence any expression in $\operatorname{Sym}_3(k)$ can involve only the elementary generators $e_1^{(3)},e_2^{(3)},e_3^{(3)}$, with $e_4^{(3)}$ contributing nothing.
In the stable ring, the element $e_4$ is represented degree by degree by the compatible family whose $n$-variable component is
\begin{align*}
e_4^{(n)}=\sum_{1\le i_1<i_2<i_3<i_4\le n}x_{i_1}x_{i_2}x_{i_3}x_{i_4}.
\end{align*}
For $n=4$ this gives
\begin{align*}
e_4^{(4)}
&=\sum_{1\le i_1<i_2<i_3<i_4\le 4}x_{i_1}x_{i_2}x_{i_3}x_{i_4}\\
&=x_1x_2x_3x_4,
\end{align*}
so the stable element $e_4$ is not the zero symmetric function. Its three-variable specialisation is still zero, since
\begin{align*}
\rho_{4,3}(e_4^{(4)})
&=e_4^{(4)}(x_1,x_2,x_3,0)\\
&=x_1x_2x_3\cdot 0\\
&=0.
\end{align*}
Thus $e_4=0$ is false in $\operatorname{Sym}(k)$, even though the image of $e_4$ in $\operatorname{Sym}_3(k)$ vanishes.
[/example]
The comparison shows why finite specialisations can lose information: a stable function may vanish after passing to too few variables. This creates a specific danger for relations among $e_1,e_2,\dots$: a relation detected in $\operatorname{Sym}_n(k)$ might only reflect that some higher $e_r^{(n)}$ have become zero. Before using the $e_r$ as coordinates on the stable ring, we therefore need a theorem that rules out genuine polynomial relations among finitely many of them. Such a theorem must test each proposed relation only after enough variables are present for all generators involved.
[quotetheorem:5185]
[citeproof:5185]
Algebraic independence gives injectivity of the polynomial algebra generated by the $e_r$, and it is exactly where the stable viewpoint differs from any fixed finite ring. In $\operatorname{Sym}_n(k)$ the element $e_{n+1}^{(n)}$ vanishes, but the stable element $e_{n+1}$ is not zero. The theorem therefore says that every genuine polynomial relation among finitely many elementary generators is absent in the stable ring, even though low-dimensional specializations can collapse some of those generators. This is the right level of freedom for a universal symmetric-function ring. The next theorem is needed to prove that no other generators are required in the stable ring.
[quotetheorem:5186]
[citeproof:5186]
This theorem is the main reason symmetric functions are tractable. The ring has infinitely many generators, but each homogeneous degree involves only finitely many of them. The grading is not cosmetic: without it, the expression $1+e_1+e_2+\cdots$ would look like a legitimate polynomial in the generators, even though it has infinitely many nonzero homogeneous parts and is not an element of $\operatorname{Sym}(k)$. Homogeneous finiteness is also needed within each degree: a degree-$d$ element is a finite $k$-linear combination of the finitely many $m_\lambda$ with $|\lambda|=d$, so coefficient comparisons and finite-variable stabilisation make sense. In the completed ring introduced below, infinite degree-wise sums such as $1+e_1+e_2+\cdots$ are allowed, but that is a different object with product and convergence interpreted degree by degree. This distinction is essential whenever generating functions are used.
## Grading, Completion, And Infinite Identities
The final problem in the chapter is interpretive: what does it mean to write a formula involving infinitely many variables or an infinite product? The answer depends on whether the expression is an element of $\operatorname{Sym}(k)$ or of its completion.
[definition: Grading On Sym]
The grading on $\operatorname{Sym}(k)$ is
\begin{align*}
\operatorname{Sym}(k) = \bigoplus_{d\ge 0}\operatorname{Sym}^d(k),
\end{align*}
where $\operatorname{Sym}^d(k)$ is the $k$-module spanned by $m_\lambda$ with $|\lambda|=d$.
[/definition]
A genuine symmetric function is a finite sum of homogeneous pieces. Generating series, however, often collect one homogeneous piece in every degree, so they require a setting where infinite degree-wise sums are permitted; this motivates the completed ring.
[definition: Completed Ring Of Symmetric Functions]
The degree completion of $\operatorname{Sym}(k)$ is
\begin{align*}
\widehat{\operatorname{Sym}}(k) := \prod_{d\ge 0}\operatorname{Sym}^d(k).
\end{align*}
[/definition]
Elements of $\widehat{\operatorname{Sym}}(k)$ may have infinitely many nonzero homogeneous components, but each component has finite degree. This is why infinite products such as $\prod_{i\ge 1}(1+x_it)$ are meaningful as formal power series in $t$ with coefficients in $\operatorname{Sym}(k)$.
[example: The Elementary Generating Series]
For each $N\ge 1$, define the finite product
\begin{align*}
E_N(t)=\prod_{i=1}^N(1+x_it).
\end{align*}
Expanding means choosing, from each factor, either $1$ or $x_it$. If we choose $x_it$ from exactly the factors indexed by a subset $I\subseteq \{1,\dots,N\}$ and choose $1$ from the remaining factors, the resulting term is
\begin{align*}
\prod_{i\in I}x_i\, t^{|I|}.
\end{align*}
Therefore
\begin{align*}
E_N(t)
&=\sum_{I\subseteq \{1,\dots,N\}}\left(\prod_{i\in I}x_i\right)t^{|I|}\\
&=\sum_{r=0}^N\left(\sum_{\substack{I\subseteq \{1,\dots,N\}\\ |I|=r}}\prod_{i\in I}x_i\right)t^r\\
&=\sum_{r=0}^N\left(\sum_{1\le i_1<\cdots<i_r\le N}x_{i_1}\cdots x_{i_r}\right)t^r\\
&=\sum_{r=0}^N e_r^{(N)}t^r.
\end{align*}
The infinite product
\begin{align*}
E(t)=\prod_{i\ge 1}(1+x_it)
\end{align*}
is read by fixing a coefficient of $t^r$ first. In the finite product $E_N(t)$, the coefficient of $t^r$ is
\begin{align*}
e_r^{(N)}=\sum_{1\le i_1<\cdots<i_r\le N}x_{i_1}\cdots x_{i_r}.
\end{align*}
Under the specialisation from $N+1$ variables to $N$ variables, every summand of $e_r^{(N+1)}$ either avoids $x_{N+1}$ or contains it, so
\begin{align*}
e_r^{(N+1)}(x_1,\dots,x_N,0)
&=\sum_{1\le i_1<\cdots<i_r\le N}x_{i_1}\cdots x_{i_r}
+\sum_{1\le i_1<\cdots<i_{r-1}\le N}x_{i_1}\cdots x_{i_{r-1}}\cdot 0\\
&=e_r^{(N)}.
\end{align*}
Thus the coefficient of $t^r$ is the stable elementary symmetric function $e_r$, and the product is the degree-wise identity
\begin{align*}
E(t)=\sum_{r\ge 0}e_rt^r.
\end{align*}
The infinite product is therefore not an ordinary finite polynomial expansion; it is compact notation for the compatible finite products $\prod_{i=1}^N(1+x_it)$ in every degree.
[/example]
The distinction between the graded ring and its completion prevents a common mistake. Infinite sums of symmetric functions are allowed only when each homogeneous degree receives a well-defined coefficient.
[remark: Meaning Of Infinite Variable Identities]
An identity in $\operatorname{Sym}(k)$ is an equality of finite sums of homogeneous symmetric functions. An identity in $\widehat{\operatorname{Sym}}(k)$ is checked degree by degree. After specialising to $n$ variables, either kind of identity gives an identity in $k[x_1,\dots,x_n]$, but the converse must be tested uniformly in degree rather than for a single value of $n$.
[/remark]
This completes the passage from finite symmetric polynomials to the stable algebraic object used throughout the course. Later bases and product rules will be constructed inside $\operatorname{Sym}(k)$, while generating functions and Cauchy-type identities will usually be read in the completed ring.
Once the stable ring has been constructed, the natural question is how to navigate inside it efficiently. Chapter 2 answers that by introducing the classical bases of $\operatorname{Sym}$, turning the stable algebra from an abstract object into a space with workable coordinates.
# 2. Classical Bases of Symmetric Functions
This chapter develops the standard coordinate systems on the graded ring $\operatorname{Sym}$ of symmetric functions. The previous chapter constructed $\operatorname{Sym}$ as a stable version of symmetric polynomial rings, proved that the elementary symmetric functions are algebraically independent generators, and introduced the monomial basis as the reference stable coordinate system. We now compare four bases: monomial, elementary, complete homogeneous, and power-sum symmetric functions. The main questions are how these bases are defined, why they span the same homogeneous pieces, and how generating series encode the change of basis.
## Monomial Symmetric Functions
How do we name the most direct symmetric function obtained from a single monomial shape? In a finite polynomial ring, symmetrising a monomial collects all distinct permutations of its exponent sequence. Passing to the stable ring $\operatorname{Sym}$ over $\mathbb Z$ gives one function for each partition, with the finite-variable length restrictions from Chapter 1 removed by stability.
Without these orbit-sum coordinates, even a small symmetric polynomial has no canonical list of coefficients. For instance, $x_1^2x_2+x_1x_2^2+x_1^2x_3+x_1x_3^2+\cdots$ is better recorded as the single coefficient of the shape $(2,1)$ than as infinitely many equal monomial coefficients indexed by pairs of variables. The monomial functions remove this redundancy and give a reference coordinate system before any multiplicative structure is imposed.
[definition: Monomial Symmetric Function]
Let $\lambda=(\lambda_1,\dots,\lambda_\ell)$ be a partition. The monomial symmetric function $m_\lambda \in \operatorname{Sym}$ is
\begin{align*}
m_\lambda=\sum x_{i_1}^{\lambda_1}\cdots x_{i_\ell}^{\lambda_\ell},
\end{align*}
where the sum ranges over all distinct monomials obtained by assigning the nonzero parts of $\lambda$ to distinct variables.
[/definition]
The phrase "distinct monomials" matters when parts are repeated: $m_{(2,2)}$ contains $x_i^2x_j^2$ once for each unordered pair of distinct indices. As in Chapter 1, the monomial basis records orbit sums for the action of permutation groups on monomials, so it is the closest basis to the definition of symmetry.
[example: Degree Four Monomial Functions]
The partitions of $4$ are $(4)$, $(3,1)$, $(2,2)$, $(2,1,1)$, and $(1,1,1,1)$. Applying the definition of $m_\lambda$ to each exponent shape gives
\begin{align*}
m_{(4)}&=\sum_i x_i^4,\\
m_{(3,1)}&=\sum_{i\ne j} x_i^3x_j,\\
m_{(2,2)}&=\sum_{i<j}x_i^2x_j^2,\\
m_{(2,1,1)}&=\sum_{\substack{i,j,k\text{ distinct}\\ j<k}}x_i^2x_jx_k,\\
m_{(1,1,1,1)}&=\sum_{i<j<k<l}x_ix_jx_kx_l.
\end{align*}
The inequalities remove repeated listings of the same monomial when equal exponents occur: for instance, $x_i^2x_j^2=x_j^2x_i^2$, so $m_{(2,2)}$ uses $i<j$, while $x_i^2x_jx_k=x_i^2x_kx_j$, so $m_{(2,1,1)}$ fixes $j<k$ after choosing the variable carrying exponent $2$. Since every degree $4$ monomial has exactly one of these five sorted exponent patterns, these five orbit sums are the natural ordered monomial coordinates for $\operatorname{Sym}^4$.
[/example]
To prove basis statements, we need an order on partitions that measures how exponent mass is distributed. The triangular behaviour of many classical bases is most naturally expressed using dominance order.
[definition: Dominance Order]
For partitions $\lambda,\mu$ of the same integer, write $\lambda \unrhd \mu$ if
\begin{align*}
\lambda_1+\cdots+\lambda_r\ge \mu_1+\cdots+\mu_r
\end{align*}
for every $r\ge 1$, where missing parts are treated as $0$.
[/definition]
Dominance order is compatible with leading exponent shapes: a product of symmetric functions often has a largest partition in dominance order, and the remaining terms are dominated by it. This is the sense in which the monomial functions provide triangular coordinates, so the next step is to confirm that these coordinates are complete.
[quotetheorem:5184]
[citeproof:5184]
This theorem gives the reference basis against which the other classical bases will be compared. Each part of the statement is doing work. The fixed-degree hypothesis is needed because $\operatorname{Sym}$ is an infinite direct sum of homogeneous pieces: the infinite list $m_\varnothing,m_{(1)},m_{(2)},\dots$ is not a finite basis for one module of bounded degree unless the grading is specified. The stable symmetric-function hypothesis is also needed: in two variables, $m_{(1,1,1)}$ is zero rather than a degree $3$ basis element, so the finite-variable basis uses only partitions of length at most the number of variables. Finally, the integral coefficient ring matters because the theorem is a $\mathbb Z$-basis statement; after quotienting by a relation such as $2m_{(1)}=0$ in a non-free module, the same independence claim would no longer describe the resulting object.
The theorem does not claim good multiplication constants or efficient recurrences. For example, $m_{(1)}m_{(2)}=m_{(3)}+m_{(2,1)}$, and products with repeated part patterns split into increasingly many orbit sums. This limitation is the reason the chapter next introduces bases with stronger multiplicative or generating-series structure.
## Elementary and Complete Homogeneous Functions
Which symmetric functions package squarefree monomials and weakly increasing monomials? The elementary functions select distinct variables, while the complete homogeneous functions allow repetitions. These two families are dual in spirit and inverse at the level of generating series.
The distinction is forced by concrete counting problems. If a problem asks for subsets of variables, repeated choices such as $x_i^2$ should be excluded, and $e_r$ is the right package. If a problem asks for multisets or weakly increasing sequences, excluding repetitions loses terms, so $h_r$ is needed instead.
[definition: Elementary Symmetric Function]
For $r\ge 0$, the elementary symmetric function $e_r\in \operatorname{Sym}$ is defined by $e_0=1$ and, for $r\ge 1$,
\begin{align*}
e_r=\sum_{i_1<\cdots<i_r}x_{i_1}\cdots x_{i_r}.
\end{align*}
For a partition $\lambda=(\lambda_1,\dots,\lambda_\ell)$, define $e_\lambda=e_{\lambda_1}\cdots e_{\lambda_\ell}$.
[/definition]
The elementary function $e_r$ is exactly $m_{(1^r)}$. Products $e_\lambda$ are no longer monomial orbit sums, but their highest triangular contribution is controlled by conjugation of partitions.
[example: Products of Elementary Functions in Degree Four]
We expand each product by counting, for a fixed monomial shape, how many choices from the elementary factors produce that monomial. For example, in $e_{3,1}=e_3e_1$, a monomial $x_a^2x_bx_c$ with $a,b,c$ distinct can only come from choosing $x_ax_bx_c$ in $e_3$ and $x_a$ in $e_1$, so its coefficient is $1$; a squarefree monomial $x_ax_bx_cx_d$ comes from choosing the singleton variable in $4$ ways, so its coefficient is $4$. Hence
\begin{align*}
e_4&=m_{(1,1,1,1)},\\
e_{3,1}&=m_{(2,1,1)}+4m_{(1,1,1,1)}.
\end{align*}
For $e_{2,2}=e_2e_2$, a fixed monomial $x_a^2x_b^2$ comes only from
\begin{align*}
(x_ax_b)(x_ax_b),
\end{align*}
so its coefficient is $1$. A fixed monomial $x_a^2x_bx_c$ comes from
\begin{align*}
(x_ax_b)(x_ax_c),\qquad (x_ax_c)(x_ax_b),
\end{align*}
so its coefficient is $2$. A squarefree monomial $x_ax_bx_cx_d$ comes from choosing the two variables in the first $e_2$ factor, with the second factor forced to contain the remaining two variables; this gives $\binom{4}{2}=6$ choices. Therefore
\begin{align*}
e_{2,2}=m_{(2,2)}+2m_{(2,1,1)}+6m_{(1,1,1,1)}.
\end{align*}
For $e_{2,1,1}=e_2e_1e_1$, the two $e_1$ factors are ordered. A fixed monomial $x_a^3x_b$ comes only from
\begin{align*}
(x_ax_b)(x_a)(x_a),
\end{align*}
so the coefficient of $m_{(3,1)}$ is $1$. A fixed monomial $x_a^2x_b^2$ comes from
\begin{align*}
(x_ax_b)(x_a)(x_b),\qquad (x_ax_b)(x_b)(x_a),
\end{align*}
so the coefficient of $m_{(2,2)}$ is $2$. A fixed monomial $x_a^2x_bx_c$ comes from
\begin{align*}
(x_bx_c)(x_a)(x_a),\qquad
(x_ax_b)(x_a)(x_c),\qquad
(x_ax_b)(x_c)(x_a),\\
(x_ax_c)(x_a)(x_b),\qquad
(x_ax_c)(x_b)(x_a),
\end{align*}
so the coefficient of $m_{(2,1,1)}$ is $5$. Finally, a squarefree monomial $x_ax_bx_cx_d$ is obtained by choosing the two variables in the $e_2$ factor and then ordering the remaining two singleton choices, giving $\binom{4}{2}\cdot 2=12$ choices. Thus
\begin{align*}
e_{2,1,1}=m_{(3,1)}+2m_{(2,2)}+5m_{(2,1,1)}+12m_{(1,1,1,1)}.
\end{align*}
For $e_{1,1,1,1}=e_1^4$, the coefficient of a monomial with multiplicity pattern $\mu$ is the number of ordered four-letter words with those multiplicities. Hence
\begin{align*}
x_a^4 &: \frac{4!}{4!}=1,\\
x_a^3x_b &: \frac{4!}{3!1!}=4,\\
x_a^2x_b^2 &: \frac{4!}{2!2!}=6,\\
x_a^2x_bx_c &: \frac{4!}{2!1!1!}=12,\\
x_ax_bx_cx_d &: \frac{4!}{1!1!1!1!}=24.
\end{align*}
Therefore
\begin{align*}
e_{1,1,1,1}=m_{(4)}+4m_{(3,1)}+6m_{(2,2)}+12m_{(2,1,1)}+24m_{(1,1,1,1)}.
\end{align*}
Reading the first nonzero term in each row gives $m_{\lambda'}$, with coefficient $1$, so these degree $4$ expansions show the conjugate-partition triangular pattern for elementary products.
[/example]
The elementary example shows how squarefree choices interact when multiplied, but many enumerative problems allow repeated variables rather than distinct ones. This motivates the complete homogeneous functions, which sum every monomial of a fixed total degree and later provide the inverse generating series to the elementary functions.
[definition: Complete Homogeneous Symmetric Function]
For $r\ge 0$, the complete homogeneous symmetric function $h_r\in \operatorname{Sym}$ is defined by $h_0=1$ and, for $r\ge 1$,
\begin{align*}
h_r=\sum_{i_1\le \cdots\le i_r}x_{i_1}\cdots x_{i_r}.
\end{align*}
For a partition $\lambda=(\lambda_1,\dots,\lambda_\ell)$, define $h_\lambda=h_{\lambda_1}\cdots h_{\lambda_\ell}$.
[/definition]
The single-row function $h_r$ is the sum of every monomial symmetric function of degree $r$. Products $h_\lambda$ count ways of distributing several weakly increasing rows of variables into one exponent partition.
[example: Products of Complete Homogeneous Functions in Degree Four]
For a product $h_\lambda=h_{\lambda_1}\cdots h_{\lambda_\ell}$, fix one monomial $x^\alpha$ of degree $4$. Its coefficient is the number of ordered decompositions
\begin{align*}
\alpha=\beta^{(1)}+\cdots+\beta^{(\ell)},\qquad |\beta^{(s)}|=\lambda_s,
\end{align*}
because each $h_{\lambda_s}$ contains each degree-$\lambda_s$ monomial once.
For $h_4$, there is only one factor, so every degree $4$ monomial occurs once:
\begin{align*}
h_4=m_{(4)}+m_{(3,1)}+m_{(2,2)}+m_{(2,1,1)}+m_{(1,1,1,1)}.
\end{align*}
For $h_{3,1}=h_3h_1$, the $h_1$ factor chooses one variable from the support of the fixed monomial. Thus the coefficients for the shapes
\begin{align*}
(4),\quad (3,1),\quad (2,2),\quad (2,1,1),\quad (1,1,1,1)
\end{align*}
are respectively
\begin{align*}
1,\quad 2,\quad 2,\quad 3,\quad 4,
\end{align*}
since these shapes have support sizes $1,2,2,3,4$. Hence
\begin{align*}
h_{3,1}=m_{(4)}+2m_{(3,1)}+2m_{(2,2)}+3m_{(2,1,1)}+4m_{(1,1,1,1)}.
\end{align*}
For $h_{2,2}=h_2h_2$, count the possible degree $2$ contribution from the first factor; the second factor is then forced. For $x_a^4$ there is only
\begin{align*}
x_a^2\cdot x_a^2.
\end{align*}
For $x_a^3x_b$, the first factor can be
\begin{align*}
x_a^2,\qquad x_ax_b,
\end{align*}
giving $2$ choices. For $x_a^2x_b^2$, the first factor can be
\begin{align*}
x_a^2,\qquad x_ax_b,\qquad x_b^2,
\end{align*}
giving $3$ choices. For $x_a^2x_bx_c$, the first factor can be
\begin{align*}
x_a^2,\qquad x_ax_b,\qquad x_ax_c,\qquad x_bx_c,
\end{align*}
giving $4$ choices. For $x_ax_bx_cx_d$, the first factor chooses any two of the four variables, giving $\binom{4}{2}=6$ choices. Therefore
\begin{align*}
h_{2,2}=m_{(4)}+2m_{(3,1)}+3m_{(2,2)}+4m_{(2,1,1)}+6m_{(1,1,1,1)}.
\end{align*}
For $h_{2,1,1}=h_2h_1h_1$, the two $h_1$ factors are ordered. For $x_a^4$, the ordered singleton choices are only
\begin{align*}
(a,a),
\end{align*}
so the coefficient is $1$. For $x_a^3x_b$, the valid ordered singleton choices are
\begin{align*}
(a,a),\qquad (a,b),\qquad (b,a),
\end{align*}
so the coefficient is $3$. For $x_a^2x_b^2$, the valid ordered singleton choices are
\begin{align*}
(a,a),\qquad (a,b),\qquad (b,a),\qquad (b,b),
\end{align*}
so the coefficient is $4$. For $x_a^2x_bx_c$, they are
\begin{align*}
(a,a),\qquad (a,b),\qquad (b,a),\qquad (a,c),\qquad (c,a),\qquad (b,c),\qquad (c,b),
\end{align*}
so the coefficient is $7$. For $x_ax_bx_cx_d$, the two ordered singleton variables must be distinct, giving $4\cdot 3=12$ choices. Hence
\begin{align*}
h_{2,1,1}=m_{(4)}+3m_{(3,1)}+4m_{(2,2)}+7m_{(2,1,1)}+12m_{(1,1,1,1)}.
\end{align*}
Finally, $h_{1,1,1,1}=h_1^4$. A fixed monomial with multiplicity pattern $\mu$ occurs once for each ordered four-letter word with those multiplicities:
\begin{align*}
x_a^4 &: \frac{4!}{4!}=1,\\
x_a^3x_b &: \frac{4!}{3!1!}=4,\\
x_a^2x_b^2 &: \frac{4!}{2!2!}=6,\\
x_a^2x_bx_c &: \frac{4!}{2!1!1!}=12,\\
x_ax_bx_cx_d &: \frac{4!}{1!1!1!1!}=24.
\end{align*}
Thus
\begin{align*}
h_{1,1,1,1}=m_{(4)}+4m_{(3,1)}+6m_{(2,2)}+12m_{(2,1,1)}+24m_{(1,1,1,1)}.
\end{align*}
These expansions show concretely that complete homogeneous products count ordered decompositions of each exponent vector into row contributions of the prescribed sizes.
[/example]
The degree $4$ expansions are useful but unwieldy as a general method. Already in degree $4$, converting $h_{2,1,1}$ by listing exponent vectors requires several separate multiplicity counts; in degree $n$, this direct bookkeeping becomes a problem about all set partitions of row contributions. To control all degrees at once, we package the single-row functions $e_r$ and $h_r$ into formal power series whose coefficients remember the homogeneous degree.
[definition: Elementary and Complete Generating Series]
The elementary and complete generating series are
\begin{align*}
E(t)&=\sum_{r\ge 0}e_rt^r, & H(t)&=\sum_{r\ge 0}h_rt^r.
\end{align*}
[/definition]
The generating series separate the two choices made variable by variable: choose a variable at most once for $E(t)$ and choose it with arbitrary multiplicity for $H(t)$. This comparison asks for a precise identity between the two series, and the answer is an inverse relation.
[quotetheorem:5180]
[citeproof:5180]
The recurrence lets us express the complete functions recursively in elementary functions and conversely, so it raises a basis question rather than only a formal-series question. The theorem has three important hypotheses and limitations. First, the sign in $E(-t)$ is forced: replacing it by $E(t)$ changes cancellation into accumulation, and the coefficient of $t^2$ contains
\begin{align*}
h_2+e_1h_1+e_2
\end{align*}
rather than $0$; in one variable this coefficient is $x_1^2+x_1^2=2x_1^2$ plus the $e_2$ term, not a cancelling coefficient. Second, the same alphabet is required. If $H$ is formed from variables $x_i$ and $E$ from unrelated variables $y_i$, the factors $(1-x_it)^{-1}$ and $(1-y_it)$ do not cancel, so even the coefficient of $t$ gives $h_1(x)-e_1(y)$ rather than $0$. Third, the identity is a formal power-series identity in stable degree; substituting a numerical value of $t$ is not part of the statement, and in finitely many variables the coefficient comparison must be made before passing to the stable limit.
The identity says nothing by itself about independence of all products $e_\lambda$ or $h_\lambda$, since a recurrence among one-row functions does not rule out relations among products. The forward use of the theorem is therefore structural rather than final: it supplies recursive conversions between the single-row elementary and complete functions, while the next theorem upgrades those recurrences to full degreewise bases.
Since the previous chapter proved that the $e_r$ are algebraically independent generators, we now ask whether the products $e_\lambda$ and $h_\lambda$ give full homogeneous coordinate systems. This is a stronger degreewise statement than generation: it asks for exactly one coordinate vector for every partition shape, with no repetitions and no missing homogeneous elements. The examples above suggest triangular expansions, but a basis theorem is needed to turn those leading terms into spanning and independence in every degree.
[quotetheorem:5187]
[citeproof:5187]
This basis theorem explains why the inverse identity is not merely a recurrence among special elements. It relates two full coordinate systems on each homogeneous component. Its hypotheses again prevent specific failures. The statement is degreewise: if products of all degrees are mixed together, $e_{(1)}$ and $e_{(2)}$ cannot both be coordinates for a single homogeneous component, and a finite matrix comparison no longer exists. The triangular proof also depends on using partitions as indices; if arbitrary compositions were allowed, repeated products such as $e_{3,1}$ and $e_{1,3}$ would duplicate the same element and could not form a basis. In finitely many variables there is another restriction: for the elementary family, any factor $e_r$ with $r>N$ vanishes in $N$ variables, so $e_3=0$ in two variables even though its stable analogue is nonzero; for the complete family, indexing all partitions in degree $n$ gives too many products once the finite-variable homogeneous component has smaller dimension.
The theorem also separates basis status from multiplicative simplicity. For instance, $e_{2,1}$ and $h_{2,1}$ are valid coordinates, but multiplying either by another basis element still requires expansion and collection in degree. The next section introduces power sums because Newton identities give a different kind of control: they connect the one-row functions $e_r,h_r$ with the power sums through logarithmic derivatives of the same generating series.
## Power Sums and Newton Identities
What basis diagonalises the operation of taking independent variable powers? The power sums forget the distribution of parts among different variables and instead measure each exponent size separately. They are especially useful over fields of characteristic zero because integer factors appearing in Newton identities can be divided.
There is a real obstruction here, not only a change of notation. Over characteristic $p$, the Newton identity for $n=p$ contains the term $p e_p$, which vanishes after reduction modulo $p$ and cannot be used to recover $e_p$ from $p_1,\dots,p_p$. For example, in characteristic $2$ the equation for $n=2$ loses the coefficient of $e_2$, so power sums cannot be treated as an integral basis without changing the ground ring.
[definition: Power-Sum Symmetric Function]
For $r\ge 1$, the power-sum symmetric function is
\begin{align*}
p_r=\sum_i x_i^r.
\end{align*}
For a partition $\lambda=(\lambda_1,\dots,\lambda_\ell)$, define $p_\lambda=p_{\lambda_1}\cdots p_{\lambda_\ell}$.
[/definition]
The function $p_r$ is $m_{(r)}$, but products $p_\lambda$ have coefficients coming from ordered assignments of parts to variables. This produces a basis after tensoring with a characteristic-zero field.
[example: Power Sums in Degree Four]
For each product $p_\lambda$, we expand the defining sums and then collect monomials by sorted exponent pattern. Since each $m_\mu$ contains each distinct monomial of shape $\mu$ exactly once, the coefficient of $m_\mu$ is the coefficient of any fixed monomial with that shape.
First,
\begin{align*}
p_4=\sum_i x_i^4=m_{(4)}.
\end{align*}
For $p_{3,1}$,
\begin{align*}
p_{3,1}
&=\left(\sum_i x_i^3\right)\left(\sum_j x_j\right)\\
&=\sum_i x_i^4+\sum_{i\ne j}x_i^3x_j\\
&=m_{(4)}+m_{(3,1)}.
\end{align*}
For $p_{2,2}$,
\begin{align*}
p_{2,2}
&=\left(\sum_i x_i^2\right)\left(\sum_j x_j^2\right)\\
&=\sum_i x_i^4+\sum_{i\ne j}x_i^2x_j^2\\
&=\sum_i x_i^4+\sum_{i<j}\left(x_i^2x_j^2+x_j^2x_i^2\right)\\
&=m_{(4)}+2m_{(2,2)}.
\end{align*}
For $p_{2,1,1}$, write
\begin{align*}
p_{2,1,1}
=\left(\sum_i x_i^2\right)\left(\sum_j x_j\right)\left(\sum_k x_k\right)
=\sum_{i,j,k}x_i^2x_jx_k.
\end{align*}
A fixed monomial $x_a^4$ occurs only from $(i,j,k)=(a,a,a)$, so its coefficient is $1$. A fixed monomial $x_a^3x_b$ with $a\ne b$ occurs from
\begin{align*}
(i,j,k)=(a,a,b),\qquad (a,b,a),
\end{align*}
so its coefficient is $2$. A fixed monomial $x_a^2x_b^2$ occurs from
\begin{align*}
(i,j,k)=(a,b,b),\qquad (b,a,a),
\end{align*}
so its coefficient is $2$. A fixed monomial $x_a^2x_bx_c$ with $a,b,c$ distinct occurs from
\begin{align*}
(i,j,k)=(a,b,c),\qquad (a,c,b),
\end{align*}
so its coefficient is $2$. Hence
\begin{align*}
p_{2,1,1}=m_{(4)}+2m_{(3,1)}+2m_{(2,2)}+2m_{(2,1,1)}.
\end{align*}
Finally,
\begin{align*}
p_{1,1,1,1}=p_1^4=\sum_{i,j,k,l}x_ix_jx_kx_l.
\end{align*}
For a fixed monomial, the coefficient is the number of ordered four-letter words with the required multiplicities:
\begin{align*}
x_a^4 &: \frac{4!}{4!}=1,\\
x_a^3x_b &: \frac{4!}{3!1!}=4,\\
x_a^2x_b^2 &: \frac{4!}{2!2!}=6,\\
x_a^2x_bx_c &: \frac{4!}{2!1!1!}=12,\\
x_ax_bx_cx_d &: \frac{4!}{1!1!1!1!}=24.
\end{align*}
Therefore
\begin{align*}
p_{1,1,1,1}=m_{(4)}+4m_{(3,1)}+6m_{(2,2)}+12m_{(2,1,1)}+24m_{(1,1,1,1)}.
\end{align*}
Thus the coefficient of each monomial orbit records exactly how many ordered parts of the indexing partition have been assigned to each variable.
[/example]
The monomial expansion shows that power sums are triangular, but it does not yet connect them to the elementary and complete generators. Naively expanding every product $p_\lambda$ into monomials gives transition coefficients by grouping equal variables, which is workable in degree $4$ but poorly suited to recursive arguments. The next theorem is needed to make this connection explicit: it converts logarithmic derivatives of $E(t)$ and $H(t)$ into recurrences involving the functions $p_r$.
[quotetheorem:5188]
[citeproof:5188]
The first identity recursively expresses $p_n$ in terms of $e_1,\dots,e_n$, and it expresses $e_n$ in terms of $p_1,\dots,p_n$ once division by $n$ is allowed. The second identity similarly expresses $p_n$ in terms of $h_1,\dots,h_n$ and solves for $h_n$ only when division by $n$ is available. The coefficient $n$ is therefore a limitation, not a cosmetic factor: in characteristic $2$, the first identity at $n=2$ becomes $e_1p_1-p_2=0$, so $e_2$ disappears, while the second becomes $p_1h_1+p_2=0$, so $h_2$ disappears.
The two identities also depend on using the same alphabet and the same formal series as before. If $p_r$ is computed from variables $x_i$ but $E(t)$ from variables $y_i$, the logarithmic derivative of $E(t)$ produces power sums in the $y$-alphabet rather than the $x$-alphabet, and the displayed recurrence has mismatched terms. Finally, these are one-row recurrences; they do not by themselves prove that the products $p_\lambda$ are independent. This is why the next theorem adds the characteristic-zero hypothesis and a basis-counting argument.
[quotetheorem:5189]
[citeproof:5189]
Characteristic zero is essential for this proof strategy and for the statement as written. In characteristic $p$, the equation for $e_p$ contains the coefficient $p$, so the recursion may lose information. For example, over a field of characteristic $2$, the first Newton identity with $n=2$ becomes $e_1p_1-p_2=0$ and no longer solves for $e_2$; equivalently, $p_1^2=p_2$ in the Frobenius sense, so the two degree-$2$ products $p_{1,1}$ and $p_2$ are not independent. The degreewise hypothesis is also necessary: $\{p_\lambda:|\lambda|=n\}$ has the correct finite size only inside $\operatorname{Sym}^n_K$, whereas the union over all degrees is a graded basis, not a basis of a single homogeneous component.
The theorem therefore does not assert that $\{p_\lambda\}$ is a basis over $\mathbb Z$ or over arbitrary fields; it is a statement after tensoring with a characteristic-zero field. Its forward role is to justify rational transition matrices, which is why the final section can invert the degree $4$ matrices over $\mathbb Q$ and why denominators appear in the power-sum coordinates. The same normalization of power sums will also be used in Chapter 3 to define the Hall inner product by the constants $z_\lambda$.
## Degree Four Transition Matrices
How can the abstract basis theorems be checked in a concrete homogeneous component? Degree $4$ is the first degree where all five partition shapes occur, so it gives a compact test case for every definition above. We order partitions as
\begin{align*}
(4),(3,1),(2,2),(2,1,1),(1,1,1,1).
\end{align*}
[example: Degree Four Matrices to the Monomial Basis]
Use the partition order
\begin{align*}
(4),(3,1),(2,2),(2,1,1),(1,1,1,1),
\end{align*}
for both rows and monomial-basis columns. A row of a transition matrix is the coefficient vector of the corresponding basis element after expansion in
\begin{align*}
m_{(4)},\ m_{(3,1)},\ m_{(2,2)},\ m_{(2,1,1)},\ m_{(1,1,1,1)}.
\end{align*}
For the monomial basis itself, each $m_\lambda$ is one basis vector, so the rows are the standard coordinate vectors:
\begin{align*}
M_{m\to m}&=
\begin{pmatrix}
1&0&0&0&0\\
0&1&0&0&0\\
0&0&1&0&0\\
0&0&0&1&0\\
0&0&0&0&1
\end{pmatrix}.
\end{align*}
For the elementary products, the degree four expansions are
\begin{align*}
e_4&=m_{(1,1,1,1)},\\
e_{3,1}&=m_{(2,1,1)}+4m_{(1,1,1,1)},\\
e_{2,2}&=m_{(2,2)}+2m_{(2,1,1)}+6m_{(1,1,1,1)},\\
e_{2,1,1}&=m_{(3,1)}+2m_{(2,2)}+5m_{(2,1,1)}+12m_{(1,1,1,1)},\\
e_{1,1,1,1}&=m_{(4)}+4m_{(3,1)}+6m_{(2,2)}+12m_{(2,1,1)}+24m_{(1,1,1,1)}.
\end{align*}
Reading these five coefficient vectors in the stated row order gives
\begin{align*}
M_{e\to m}&=
\begin{pmatrix}
0&0&0&0&1\\
0&0&0&1&4\\
0&0&1&2&6\\
0&1&2&5&12\\
1&4&6&12&24
\end{pmatrix}.
\end{align*}
For the complete homogeneous products, each coefficient counts ordered decompositions of a fixed degree four exponent vector into row contributions of the prescribed sizes. The resulting expansions are
\begin{align*}
h_4&=m_{(4)}+m_{(3,1)}+m_{(2,2)}+m_{(2,1,1)}+m_{(1,1,1,1)},\\
h_{3,1}&=m_{(4)}+2m_{(3,1)}+2m_{(2,2)}+3m_{(2,1,1)}+4m_{(1,1,1,1)},\\
h_{2,2}&=m_{(4)}+2m_{(3,1)}+3m_{(2,2)}+4m_{(2,1,1)}+6m_{(1,1,1,1)},\\
h_{2,1,1}&=m_{(4)}+3m_{(3,1)}+4m_{(2,2)}+7m_{(2,1,1)}+12m_{(1,1,1,1)},\\
h_{1,1,1,1}&=m_{(4)}+4m_{(3,1)}+6m_{(2,2)}+12m_{(2,1,1)}+24m_{(1,1,1,1)}.
\end{align*}
Thus
\begin{align*}
M_{h\to m}&=
\begin{pmatrix}
1&1&1&1&1\\
1&2&2&3&4\\
1&2&3&4&6\\
1&3&4&7&12\\
1&4&6&12&24
\end{pmatrix}.
\end{align*}
For the power sums, expand the defining products and collect equal monomial shapes:
\begin{align*}
p_4&=m_{(4)},\\
p_{3,1}
&=\left(\sum_i x_i^3\right)\left(\sum_j x_j\right)
=\sum_i x_i^4+\sum_{i\ne j}x_i^3x_j
=m_{(4)}+m_{(3,1)},\\
p_{2,2}
&=\left(\sum_i x_i^2\right)\left(\sum_j x_j^2\right)
=\sum_i x_i^4+\sum_{i\ne j}x_i^2x_j^2
=m_{(4)}+2m_{(2,2)}.
\end{align*}
For
\begin{align*}
p_{2,1,1}=\sum_{i,j,k}x_i^2x_jx_k,
\end{align*}
the fixed monomial $x_a^4$ occurs from $(i,j,k)=(a,a,a)$, the fixed monomial $x_a^3x_b$ occurs from $(a,a,b)$ and $(a,b,a)$, the fixed monomial $x_a^2x_b^2$ occurs from $(a,b,b)$ and $(b,a,a)$, and the fixed monomial $x_a^2x_bx_c$ occurs from $(a,b,c)$ and $(a,c,b)$. Hence
\begin{align*}
p_{2,1,1}=m_{(4)}+2m_{(3,1)}+2m_{(2,2)}+2m_{(2,1,1)}.
\end{align*}
Finally,
\begin{align*}
p_{1,1,1,1}=p_1^4=m_{(4)}+4m_{(3,1)}+6m_{(2,2)}+12m_{(2,1,1)}+24m_{(1,1,1,1)},
\end{align*}
because the coefficients are the multinomial counts
\begin{align*}
\frac{4!}{4!}=1,\qquad
\frac{4!}{3!1!}=4,\qquad
\frac{4!}{2!2!}=6,\qquad
\frac{4!}{2!1!1!}=12,\qquad
\frac{4!}{1!1!1!1!}=24.
\end{align*}
Therefore
\begin{align*}
M_{p\to m}&=
\begin{pmatrix}
1&0&0&0&0\\
1&1&0&0&0\\
1&0&2&0&0\\
1&2&2&2&0\\
1&4&6&12&24
\end{pmatrix}.
\end{align*}
For instance, the fourth row of $M_{p\to m}$ is exactly the coefficient vector of
\begin{align*}
p_{2,1,1}=m_{(4)}+2m_{(3,1)}+2m_{(2,2)}+2m_{(2,1,1)}.
\end{align*}
[/example]
These matrices determine all transition matrices among the four bases by inversion and multiplication over $\mathbb Q$. The determinant of each displayed matrix is nonzero, matching the basis theorems in degree $4$.
[example: Degree Four Inverse Monomial Coordinates]
Use the same partition order
\begin{align*}
(4),(3,1),(2,2),(2,1,1),(1,1,1,1).
\end{align*}
Since the transition matrices in the previous example send $e$-, $h$-, and $p$-coordinates to monomial coordinates, the inverse matrices are the matrices whose rows solve
\begin{align*}
\text{row}\cdot M_{e\to m}=e_i,\qquad
\text{row}\cdot M_{h\to m}=e_i,\qquad
\text{row}\cdot M_{p\to m}=e_i,
\end{align*}
where $e_i$ denotes the $i$th standard row vector.
For the elementary basis, the inverse is
\begin{align*}
M_{m\to e}&=
\begin{pmatrix}
-4&4&2&-4&1\\
4&-1&-2&1&0\\
2&-2&1&0&0\\
-4&1&0&0&0\\
1&0&0&0&0
\end{pmatrix}.
\end{align*}
For instance, the first row is verified by multiplying it by the columns of $M_{e\to m}$:
\begin{align*}
(-4,4,2,-4,1)
\begin{pmatrix}0\\0\\0\\0\\1\end{pmatrix}&=1,\\
(-4,4,2,-4,1)
\begin{pmatrix}0\\0\\0\\1\\4\end{pmatrix}&=-4+4=0,\\
(-4,4,2,-4,1)
\begin{pmatrix}0\\0\\1\\2\\6\end{pmatrix}&=2-8+6=0,\\
(-4,4,2,-4,1)
\begin{pmatrix}0\\1\\2\\5\\12\end{pmatrix}&=4+4-20+12=0,\\
(-4,4,2,-4,1)
\begin{pmatrix}1\\4\\6\\12\\24\end{pmatrix}&=-4+16+12-48+24=0.
\end{align*}
Thus the first row gives $m_{(4)}=-4e_4+4e_{3,1}+2e_{2,2}-4e_{2,1,1}+e_{1,1,1,1}$, and the other rows are checked in the same row-column way.
For the complete homogeneous basis, the inverse is
\begin{align*}
M_{m\to h}&=
\begin{pmatrix}
4&-4&-2&4&-1\\
-4&7&2&-7&2\\
-2&2&3&-4&1\\
4&-7&-4&10&-3\\
-1&2&1&-3&1
\end{pmatrix}.
\end{align*}
For example, the last row is checked against the columns of $M_{h\to m}$:
\begin{align*}
(-1,2,1,-3,1)
\begin{pmatrix}1\\1\\1\\1\\1\end{pmatrix}&=-1+2+1-3+1=0,\\
(-1,2,1,-3,1)
\begin{pmatrix}1\\2\\2\\3\\4\end{pmatrix}&=-1+4+2-9+4=0,\\
(-1,2,1,-3,1)
\begin{pmatrix}1\\2\\3\\4\\6\end{pmatrix}&=-1+4+3-12+6=0,\\
(-1,2,1,-3,1)
\begin{pmatrix}1\\3\\4\\7\\12\end{pmatrix}&=-1+6+4-21+12=0,\\
(-1,2,1,-3,1)
\begin{pmatrix}1\\4\\6\\12\\24\end{pmatrix}&=-1+8+6-36+24=1.
\end{align*}
Therefore $m_{(1,1,1,1)}=-h_4+2h_{3,1}+h_{2,2}-3h_{2,1,1}+h_{1,1,1,1}$.
For the power-sum basis, the inverse is
\begin{align*}
M_{m\to p}&=
\begin{pmatrix}
1&0&0&0&0\\
-1&1&0&0&0\\
-\frac12&0&\frac12&0&0\\
1&-1&-\frac12&\frac12&0\\
-\frac14&\frac13&\frac18&-\frac14&\frac1{24}
\end{pmatrix}.
\end{align*}
The last row gives the coefficient vector of $m_{(1,1,1,1)}$ in the power-sum basis, and multiplying by the columns of $M_{p\to m}$ gives
\begin{align*}
\left(-\frac14,\frac13,\frac18,-\frac14,\frac1{24}\right)
\begin{pmatrix}1\\1\\1\\1\\1\end{pmatrix}
&=-\frac14+\frac13+\frac18-\frac14+\frac1{24}=0,\\
\left(-\frac14,\frac13,\frac18,-\frac14,\frac1{24}\right)
\begin{pmatrix}0\\1\\0\\2\\4\end{pmatrix}
&=\frac13-\frac12+\frac16=0,\\
\left(-\frac14,\frac13,\frac18,-\frac14,\frac1{24}\right)
\begin{pmatrix}0\\0\\2\\2\\6\end{pmatrix}
&=\frac14-\frac12+\frac14=0,\\
\left(-\frac14,\frac13,\frac18,-\frac14,\frac1{24}\right)
\begin{pmatrix}0\\0\\0\\2\\12\end{pmatrix}
&=-\frac12+\frac12=0,\\
\left(-\frac14,\frac13,\frac18,-\frac14,\frac1{24}\right)
\begin{pmatrix}0\\0\\0\\0\\24\end{pmatrix}
&=1.
\end{align*}
Hence
\begin{align*}
m_{(1,1,1,1)}=-\frac14p_4+\frac13p_{3,1}+\frac18p_{2,2}-\frac14p_{2,1,1}+\frac1{24}p_{1,1,1,1}.
\end{align*}
The denominators in this last formula come from inverting the integer transition matrix for the power sums, so these coordinates naturally live over $\mathbb Q$.
[/example]
Combining the preceding two examples gives, for instance, $M_{e\to h}=M_{e\to m}M_{m\to h}$ and $M_{h\to p}=M_{h\to m}M_{m\to p}$. Thus every degree $4$ conversion among $m,e,h,p$ is obtained by matrix multiplication from the displayed data. The point of writing the matrices is not memorisation; it is to see triangularity, integrality for $e$ and $h$, and the rational denominators forced by the power-sum basis.
After fixing the standard bases, the next structural layer is duality: which bases pair naturally, which operators have adjoints, and how the Hall inner product organizes these relations. Chapter 3 develops that bilinear framework so that later identities can be read both algebraically and combinatorially.
# 3. Inner Products, Involutions, and Duality
This chapter adds a bilinear form to the ring of symmetric functions. It assumes the earlier construction of $\operatorname{Sym}$ in Chapter 1, the classical bases $p_\lambda$, $h_\lambda$, $e_\lambda$, and $m_\lambda$ from Chapter 2, and the generating-function identities relating them. The point is not to introduce extra structure for its own sake, but to make the main bases interact through adjointness and duality. The power-sum basis gives the cleanest formula for the form, while the monomial and complete homogeneous bases reveal how coefficient extraction and multiplication are paired. The chapter ends with the involution $\omega$, which explains why elementary and complete homogeneous symmetric functions behave as mirror images.
## The Hall Inner Product from Power Sums
The first question is how to choose an inner product on $\operatorname{Sym}$ that respects the grading and turns the power sums into independent coordinates. Since the power sums multiply according to partitions, the natural answer is to make the basis $(p_\lambda)$ orthogonal, but with a normalization that remembers the multiplicities of equal parts.
A naive orthonormal choice $\langle p_\lambda,p_\mu\rangle=\delta_{\lambda,\mu}$ would make the power sums look simple, but it would break the Cauchy kernel normalization used to relate symmetric functions to coefficient extraction. The missing factor is exactly the symmetry factor of a permutation with cycle type $\lambda$. For a partition $\lambda = (1^{m_1}2^{m_2}\cdots)$, this factor is the order of the centraliser of such a permutation.
[definition: Partition Centraliser Constant]
Let $\operatorname{Par}$ be the set of partitions. The partition centraliser constant is the map $z: \operatorname{Par} \to \mathbb{Z}_{>0}$ defined as follows: for $\lambda = (1^{m_1}2^{m_2}\cdots)$,
\begin{align*}
z_\lambda = \prod_{i \ge 1} i^{m_i}m_i!.
\end{align*}
[/definition]
The factors $i^{m_i}$ record rotations within cycles of length $i$, and $m_i!$ records permutation of cycles with the same length. This constant is also the denominator that appears when class functions on symmetric groups are expressed in the power-sum basis, which is why it is the correct normalization for the Hall form.
[example: Centraliser Constants in Degree Four]
For a partition $\lambda=(1^{m_1}2^{m_2}\cdots)$, the partition centraliser constant is
\begin{align*}
z_\lambda=\prod_{i\geq 1} i^{m_i}m_i!.
\end{align*}
The five partitions of $4$ give the following multiplicities and constants:
\begin{align*}
z_{(4)}
&=4^1\cdot 1!
=4,\\
z_{(3,1)}
&=3^1\cdot 1!\cdot 1^1\cdot 1!
=3,\\
z_{(2,2)}
&=2^2\cdot 2!
=4\cdot 2
=8,\\
z_{(2,1,1)}
&=2^1\cdot 1!\cdot 1^2\cdot 2!
=2\cdot 2
=4,\\
z_{(1,1,1,1)}
&=1^4\cdot 4!
=24.
\end{align*}
Thus $z_\lambda$ records both the sizes of the parts and their repetitions: $(4)$ has one part of size $4$, while $(2,2)$ has two equal parts of size $2$, so the extra factor $2!$ accounts for interchanging those equal parts.
[/example]
We now impose orthogonality in the power-sum basis. This is a definition of the form, but it is also the computational rule from which all later dualities will be derived.
[definition: Hall Inner Product]
The Hall inner product on $\operatorname{Sym}$ is the bilinear form $\langle \cdot, \cdot \rangle: \operatorname{Sym} \times \operatorname{Sym} \to \mathbb{Q}$ determined by
\begin{align*}
\langle p_\lambda, p_\mu \rangle = z_\lambda \delta_{\lambda,\mu}
\end{align*}
for all partitions $\lambda,\mu$.
[/definition]
Because the power sums form a basis over $\mathbb{Q}$, this prescription determines a unique bilinear form. The first structural fact to check is that the form is not degenerate on any homogeneous component, since all adjoint operators later in the chapter depend on uniqueness under this pairing.
[quotetheorem:5190]
[citeproof:5190]
The restriction to a fixed homogeneous component matters: the Hall form pairs different degrees separately, so nondegeneracy is used degree by degree rather than on an infinite formal completion. The hypothesis that the $p_\lambda$ form a basis over $\mathbb{Q}$ is also essential, because the argument reads off the coefficients of $f$ from its pairings with the same basis. The theorem does not say that every familiar basis is orthogonal; it says that every calculation can be moved into power-sum coordinates. This is useful because many identities involving $e_n$ and $h_n$ have compact logarithmic forms in the $p_k$.
[example: Pairing Complete Functions in Degree Two]
Using
\begin{align*}
h_2=\frac{p_1^2+p_2}{2}
\qquad\text{and}\qquad
e_2=\frac{p_1^2-p_2}{2},
\end{align*}
we first note that $p_1^2=p_{(1,1)}$ and $p_2=p_{(2)}$. By bilinearity of the Hall inner product and the defining orthogonality of the power-sum basis,
\begin{align*}
\langle h_2,h_2\rangle
&=\left\langle \frac{p_1^2+p_2}{2},\frac{p_1^2+p_2}{2}\right\rangle\\
&=\frac{1}{4}\left(
\langle p_1^2,p_1^2\rangle
+\langle p_1^2,p_2\rangle
+\langle p_2,p_1^2\rangle
+\langle p_2,p_2\rangle
\right)\\
&=\frac{1}{4}\left(
z_{(1,1)}
+0
+0
+z_{(2)}
\right).
\end{align*}
The constants are
\begin{align*}
z_{(1,1)}=1^2\cdot 2!=2,
\qquad
z_{(2)}=2^1\cdot 1!=2,
\end{align*}
so
\begin{align*}
\langle h_2,h_2\rangle
&=\frac{1}{4}(2+2)
=1.
\end{align*}
For the mixed pairing, the sign of the $p_2$ term changes:
\begin{align*}
\langle e_2,h_2\rangle
&=\left\langle \frac{p_1^2-p_2}{2},\frac{p_1^2+p_2}{2}\right\rangle\\
&=\frac{1}{4}\left(
\langle p_1^2,p_1^2\rangle
+\langle p_1^2,p_2\rangle
-\langle p_2,p_1^2\rangle
-\langle p_2,p_2\rangle
\right)\\
&=\frac{1}{4}\left(
z_{(1,1)}
+0
-0
-z_{(2)}
\right)\\
&=\frac{1}{4}(2-2)
=0.
\end{align*}
Thus $h_2$ has Hall norm $1$, while $e_2$ is orthogonal to $h_2$ in degree $2$ because the two equal diagonal contributions cancel.
[/example]
## Complete and Monomial Duality
The next problem is to interpret the Hall inner product in a basis that is closer to enumeration. Power-sum coordinates are excellent for diagonalizing the form, but they do not directly tell us how to read coefficients in an expansion by complete homogeneous functions. Without a dual basis statement, multiplication and coefficient extraction would remain separate operations with no uniform pairing rule. The monomial basis records orbit sums of individual exponent patterns, while the complete homogeneous basis records weakly increasing choices of variables, and their pairing is designed to extract a single coefficient.
Before stating the duality, recall that $h_\lambda = h_{\lambda_1}\cdots h_{\lambda_\ell}$ and $m_\mu$ is the sum of all distinct monomials whose exponent partition is $\mu$. The duality asserts that these two very different constructions are exactly matched by the Hall form.
[quotetheorem:5191]
[citeproof:5191]
This result is the main reason the Hall form is more than an arbitrary normalization. The hypotheses that $\lambda$ and $\mu$ are partitions keep both bases inside the same homogeneous component, where the preceding nondegeneracy result applies. The theorem does not assert that either the complete basis or the monomial basis is orthogonal by itself; instead, it says that they are paired across two different combinatorial descriptions of the same space. This connection between multiplication by complete functions and coefficient extraction against monomial functions will later be repackaged as skewing operators and Schur duality.
[example: Degree Three Pairings]
In degree $3$, the partitions are $(3)$, $(2,1)$, and $(1,1,1)$, so the complete basis is $h_3,h_{2,1},h_{1,1,1}$ and the monomial basis is $m_3,m_{2,1},m_{1,1,1}$. For
\begin{align*}
f = 2h_3-5h_{2,1}+h_{1,1,1},
\end{align*}
bilinearity of the Hall inner product in the first variable gives
\begin{align*}
\langle f,m_{2,1}\rangle
&=\langle 2h_3-5h_{2,1}+h_{1,1,1},m_{2,1}\rangle\\
&=2\langle h_3,m_{2,1}\rangle
-5\langle h_{2,1},m_{2,1}\rangle
+\langle h_{1,1,1},m_{2,1}\rangle.
\end{align*}
By *Complete Monomial Duality*, $\langle h_\lambda,m_\mu\rangle=\delta_{\lambda,\mu}$, so
\begin{align*}
\langle h_3,m_{2,1}\rangle &= \delta_{(3),(2,1)}=0,\\
\langle h_{2,1},m_{2,1}\rangle &= \delta_{(2,1),(2,1)}=1,\\
\langle h_{1,1,1},m_{2,1}\rangle &= \delta_{(1,1,1),(2,1)}=0.
\end{align*}
Substituting these three values,
\begin{align*}
\langle f,m_{2,1}\rangle
&=2\cdot 0-5\cdot 1+0\\
&=-5.
\end{align*}
Thus pairing with $m_{2,1}$ selects the coefficient of $h_{2,1}$ in the complete-basis expansion of $f$, even though the Hall inner product is not diagonal in every familiar basis.
[/example]
The constants $z_\lambda$ can now be seen from two viewpoints. In the power-sum basis they are the diagonal norms, while in the Cauchy kernel they are the denominators that make the same kernel reproduce the Hall pairing. This kernel viewpoint is developed systematically in Chapter 6, where the complete-monomial expansion becomes the Cauchy identity.
[remark: Why the Normalization Is Forced]
If the power sums were made orthonormal instead, the exponential Cauchy kernel would not match the product kernel $\prod_{i,j}(1-x_i y_j)^{-1}$ without extra factors. The choice $\langle p_\lambda,p_\lambda\rangle=z_\lambda$ makes the combinatorial expansion by monomial functions and the algebraic expansion by power sums agree.
[/remark]
## Adjoint Operators in Small Degrees
Once an inner product is fixed, multiplication operators have adjoints. Without adjoints, there is no intrinsic way to reverse multiplication by a symmetric function: deleting a part from a partition depends on the basis, and coefficient extraction alone does not produce an operator on $\operatorname{Sym}$. The Hall form turns this vague inverse operation into a well-defined linear operator. The guiding question is how these adjoints act on familiar bases, especially because later skew symmetric functions will be defined through this adjointness.
[definition: Adjoint Multiplication Operator]
Let $f \in \operatorname{Sym}$. The operator $f^\perp: \operatorname{Sym} \to \operatorname{Sym}$ is defined by
\begin{align*}
\langle f^\perp g, h\rangle = \langle g,fh\rangle
\end{align*}
for all $g,h \in \operatorname{Sym}$.
[/definition]
Nondegeneracy on each homogeneous degree guarantees that $f^\perp g$ is uniquely determined, and its degree is $\deg(g)-\deg(f)$ when $g$ is homogeneous of degree at least $\deg(f)$. To compute these operators rather than merely define them, we first ask what happens for multiplication by a single power sum, where the partition combinatorics is visible.
[quotetheorem:5192]
[citeproof:5192]
The multiplicity hypothesis is essential: if no part $k$ occurs in $\lambda$, there is no partition $\mu$ whose product $p_kp_\mu$ has type $\lambda$, so the adjoint must vanish. The theorem does not say that $p_k^\perp$ is ordinary deletion of a part; the Hall normalization contributes the factor $k m_k$. This formula should be read as a differential rule: $p_k^\perp$ behaves like $k\partial/\partial p_k$ on the polynomial algebra $\mathbb{Q}[p_1,p_2,\dots]$. That differential interpretation is the model for later skewing operators, where adjoints remove shapes rather than individual power-sum parts.
[example: Adjoint to Multiplication by $p_1$ and $p_2$]
By *Power Sum Adjoint Formula*, if $\lambda=(1^{m_1}2^{m_2}\cdots)$, then
\begin{align*}
p_k^\perp p_\lambda = k m_k p_{\lambda\setminus k}.
\end{align*}
For $\lambda=(2,1,1)$ and $k=1$, the multiplicity of the part $1$ is $m_1=2$, and removing one part equal to $1$ gives $\lambda\setminus 1=(2,1)$. Hence
\begin{align*}
p_1^\perp p_{2,1,1}
&=1\cdot 2\cdot p_{2,1}\\
&=2p_{2,1}.
\end{align*}
For $\lambda=(3)$ and $k=1$, the multiplicity of the part $1$ is $m_1=0$, so
\begin{align*}
p_1^\perp p_3
&=1\cdot 0\cdot p_{\lambda\setminus 1}\\
&=0.
\end{align*}
For $p_2^\perp$, the same rule gives a factor $2m_2$. If $\lambda=(2,2,1)$, then $m_2=2$ and $\lambda\setminus 2=(2,1)$, so
\begin{align*}
p_2^\perp p_{2,2,1}
&=2\cdot 2\cdot p_{2,1}\\
&=4p_{2,1}.
\end{align*}
If $\lambda=(3,1)$, then no part is equal to $2$, so $m_2=0$ and
\begin{align*}
p_2^\perp p_{3,1}
&=2\cdot 0\cdot p_{\lambda\setminus 2}\\
&=0.
\end{align*}
The normalization is visible in the nonzero cases: the adjoint deletes one matching part, but the Hall inner product also contributes the factor $k m_k$.
[/example]
Adjoints to multiplication by $h_k$ are less diagonal in the power-sum basis, but the complete-monomial duality makes their meaning transparent once the correct test functions are chosen. In small degrees, direct conversion to power sums is the most economical method.
[example: Adjoint to Multiplication by $h_1$ in Degree Two]
Let $g=h_2$. Since $h_1=p_1$ and
\begin{align*}
h_2=\frac{p_1^2+p_2}{2}
=\frac{p_{(1,1)}+p_{(2)}}{2},
\end{align*}
linearity of $p_1^\perp$ gives
\begin{align*}
h_1^\perp h_2
&=p_1^\perp h_2\\
&=p_1^\perp\left(\frac{p_{(1,1)}+p_{(2)}}{2}\right)\\
&=\frac{1}{2}\left(p_1^\perp p_{(1,1)}+p_1^\perp p_{(2)}\right).
\end{align*}
By *Power Sum Adjoint Formula*, $p_k^\perp p_\lambda=k m_k p_{\lambda\setminus k}$, where $m_k$ is the multiplicity of the part $k$ in $\lambda$. For $\lambda=(1,1)$, the part $1$ has multiplicity $m_1=2$, and removing one part $1$ gives $(1)$, so
\begin{align*}
p_1^\perp p_{(1,1)}
&=1\cdot 2\cdot p_{(1)}\\
&=2p_1.
\end{align*}
For $\lambda=(2)$, the part $1$ has multiplicity $m_1=0$, so
\begin{align*}
p_1^\perp p_{(2)}
&=1\cdot 0\cdot p_{\lambda\setminus 1}\\
&=0.
\end{align*}
Substituting these two values,
\begin{align*}
h_1^\perp h_2
&=\frac{1}{2}\left(2p_1+0\right)\\
&=p_1\\
&=h_1.
\end{align*}
Thus the adjoint to multiplication by $h_1$ removes one degree from $h_2$ and leaves the unique complete homogeneous function in degree $1$.
[/example]
The same method works for larger complete functions, although the expansions become longer. The example below is the first nontrivial case where both $p_1^2$ and $p_2$ contribute to the adjoint.
[example: Adjoint to Multiplication by $h_2$ in Degree Three]
Since
\begin{align*}
h_2
&=\frac{p_1^2+p_2}{2}
=\frac{p_{(1,1)}+p_{(2)}}{2},
\end{align*}
linearity of adjoints and $(p_1^2)^\perp=(p_1^\perp)^2$ give
\begin{align*}
h_2^\perp
&=\frac{1}{2}\left((p_1^\perp)^2+p_2^\perp\right).
\end{align*}
We apply this to
\begin{align*}
h_3
&=\frac{p_1^3+3p_1p_2+2p_3}{6}\\
&=\frac{p_{(1,1,1)}+3p_{(2,1)}+2p_{(3)}}{6}.
\end{align*}
By *Power Sum Adjoint Formula*, $p_k^\perp p_\lambda=km_kp_{\lambda\setminus k}$. For the $p_1^\perp$ part,
\begin{align*}
p_1^\perp h_3
&=\frac{1}{6}\left(p_1^\perp p_{(1,1,1)}
+3p_1^\perp p_{(2,1)}
+2p_1^\perp p_{(3)}\right)\\
&=\frac{1}{6}\left(1\cdot 3\cdot p_{(1,1)}
+3\cdot 1\cdot 1\cdot p_{(2)}
+2\cdot 1\cdot 0\right)\\
&=\frac{1}{6}\left(3p_{(1,1)}+3p_{(2)}\right)\\
&=\frac{p_{(1,1)}+p_{(2)}}{2}\\
&=h_2.
\end{align*}
Applying $p_1^\perp$ once more,
\begin{align*}
(p_1^\perp)^2h_3
&=p_1^\perp h_2\\
&=p_1^\perp\left(\frac{p_{(1,1)}+p_{(2)}}{2}\right)\\
&=\frac{1}{2}\left(1\cdot 2\cdot p_{(1)}+1\cdot 0\right)\\
&=p_1.
\end{align*}
For the $p_2^\perp$ part,
\begin{align*}
p_2^\perp h_3
&=\frac{1}{6}\left(p_2^\perp p_{(1,1,1)}
+3p_2^\perp p_{(2,1)}
+2p_2^\perp p_{(3)}\right)\\
&=\frac{1}{6}\left(2\cdot 0
+3\cdot 2\cdot 1\cdot p_{(1)}
+2\cdot 2\cdot 0\right)\\
&=\frac{6p_1}{6}\\
&=p_1.
\end{align*}
Substituting into the operator formula,
\begin{align*}
h_2^\perp h_3
&=\frac{1}{2}\left((p_1^\perp)^2h_3+p_2^\perp h_3\right)\\
&=\frac{1}{2}(p_1+p_1)\\
&=p_1\\
&=h_1.
\end{align*}
Thus, in this degree-three computation, the adjoint to multiplication by $h_2$ removes two degrees from $h_3$ and leaves the unique complete homogeneous function of degree $1$.
[/example]
## The Omega Involution
The final question is why the elementary and complete homogeneous bases occur in parallel throughout the theory. Treating the two bases separately would duplicate many arguments: every identity for $h_n$ would need a second proof for $e_n$, and the relationship between the two versions would remain hidden. Their generating functions are inverse after changing the sign of the variable, so there should be an algebra automorphism that exchanges them.
[definition: Omega Involution]
The omega map is the graded algebra homomorphism $\omega: \operatorname{Sym} \to \operatorname{Sym}$ determined by
\begin{align*}
\omega(e_n)=h_n
\end{align*}
for every $n\ge 1$.
[/definition]
Since $\operatorname{Sym}=\mathbb{Q}[e_1,e_2,\dots]$, the assignment on the algebraically independent generators $e_n$ defines a unique algebra homomorphism. To use $\omega$ in computations, we need its effect on the other classical generators, especially the power sums that diagonalize the Hall product.
[quotetheorem:5193]
[citeproof:5193]
The theorem uses that $\omega$ is a graded algebra homomorphism, so knowing its values on the generators determines its values on products such as $h_\lambda$ and $p_\lambda$. It does not say that every basis vector is fixed or merely renamed; in the power-sum basis the action is diagonal with signs depending on parity. This signed diagonal form is what makes later compatibility with the Hall product transparent. It is also the algebraic shadow of transposing Young diagrams, which becomes visible in Chapter 4 when Schur functions are introduced through semistandard tableaux and in Chapter 5 through the dual Jacobi-Trudi identity.
[example: Omega in Degree Three]
In degree $3$, [Omega on Classical Bases] gives
\begin{align*}
\omega(p_n)=(-1)^{n-1}p_n
\end{align*}
for every $n\geq 1$, and multiplicativity gives the value on products. Therefore
\begin{align*}
\omega(p_3)
&=(-1)^{3-1}p_3\\
&=(-1)^2p_3\\
&=p_3,
\end{align*}
while
\begin{align*}
\omega(p_2p_1)
&=\omega(p_2)\omega(p_1)\\
&=\left((-1)^{2-1}p_2\right)\left((-1)^{1-1}p_1\right)\\
&=(-p_2)(p_1)\\
&=-p_2p_1,
\end{align*}
and
\begin{align*}
\omega(p_1^3)
&=\omega(p_1)^3\\
&=\left((-1)^{1-1}p_1\right)^3\\
&=p_1^3.
\end{align*}
The same theorem gives $\omega(h_n)=e_n$ for every $n\geq 1$, so
\begin{align*}
\omega(h_3)
&=e_3,
\end{align*}
and, since $h_{2,1}=h_2h_1$,
\begin{align*}
\omega(h_{2,1})
&=\omega(h_2h_1)\\
&=\omega(h_2)\omega(h_1)\\
&=e_2e_1.
\end{align*}
Equivalently, for a power-sum product $p_\lambda$, the sign is $(-1)^{|\lambda|-\ell(\lambda)}$: for $(3)$ this exponent is $3-1=2$, for $(2,1)$ it is $3-2=1$, and for $(1,1,1)$ it is $3-3=0$.
[/example]
It remains to check that $\omega$ preserves the Hall inner product. This matters because it means that every adjointness relation has an elementary version and a complete homogeneous version of the same strength.
[quotetheorem:5194]
[citeproof:5194]
The involution statement depends on the two generator families being exchanged in both directions; a map sending $e_n$ to $h_n$ would not be useful for duality unless it could also be reversed. The isometry statement is stronger than preserving degrees or products: it says that adjointness relations survive after applying $\omega$. It does not mean that $\omega$ fixes each symmetric function, as the sign change on $p_2$ already shows. In later computations, this isometry property lets elementary and complete homogeneous formulas be read as paired versions of one another.
[remark: Duality Transported by Omega]
Since $\omega$ is an isometry and $\omega(h_n)=e_n$, adjoint operators involving complete homogeneous functions have elementary-function counterparts. This is often the fastest way to translate a calculation involving complete functions into the corresponding calculation involving elementary functions; the formal identity is $e_n^\perp=\omega h_n^\perp\omega$.
[/remark]
The chapter has built the linear algebra needed for the rest of the course. The Hall inner product makes the power sums orthogonal, the complete and monomial bases dual, adjoint operators computable, and the omega involution an isometry. These tools will be used repeatedly when Schur functions are introduced through determinants, tableaux, and Cauchy identities.
With inner products, adjoints, and the omega involution in place, the stage is set for the central basis of the theory: Schur functions. Chapter 4 introduces them through tableaux, giving a combinatorial definition that links the algebra of $\operatorname{Sym}$ to partitions and Young diagrams.
# 4. Young Tableaux and Schur Functions
This chapter introduces Schur functions through semistandard Young tableaux. Chapters 1 and 2 built the stable ring of symmetric functions and its classical bases; here the new feature is that a basis element is defined by a generating function over fillings of a Young diagram. The main questions are why this tableau generating function is symmetric, how its monomial expansion is controlled by Kostka numbers, and why the resulting functions form an integral basis of $\operatorname{Sym}$.
## Semistandard Tableaux and Weights
What kind of filling of a Young diagram produces a symmetric function rather than an arbitrary formal power series? If all fillings of a fixed shape were allowed, the shape would impose no useful order: in a two-box column, both fillings $1$ above $2$ and $2$ above $1$ would contribute independently, and later basis changes would not be controlled by the geometry of the diagram. The semistandard rule is the compromise that permits repetitions along rows but restricts them down columns, so that the weight records the multiset of entries while the shape records the partition being filled.
[definition: Semistandard Young Tableau]
Let $\lambda$ be a partition. A semistandard Young tableau of shape $\lambda$ is a filling $T$ of the boxes of the Young diagram of $\lambda$ by positive integers such that:
1. entries weakly increase from left to right along each row;
2. entries strictly increase from top to bottom down each column.
The set of all semistandard Young tableaux of shape $\lambda$ is denoted $\operatorname{SSYT}(\lambda)$.
[/definition]
The row condition permits equal entries to appear horizontally, while the column condition prevents equal entries from stacking vertically. To build a generating function, we need to turn each allowed filling into a monomial, and this motivates recording how many times each label appears.
[example: Row and Column Shapes]
For the row shape $(3)$ with entries in $\{1,2,3\}$, a filling is a single row
\begin{align*}
\begin{array}{ccc}
a & b & c
\end{array}
\end{align*}
and the semistandard condition is exactly $a \le b \le c$. Thus the allowed rows are
\begin{align*}
\begin{array}{ccc}1&1&1\end{array},\quad
\begin{array}{ccc}1&1&2\end{array},\quad
\begin{array}{ccc}1&1&3\end{array},\quad
\begin{array}{ccc}1&2&2\end{array},\quad
\begin{array}{ccc}1&2&3\end{array},\quad
\begin{array}{ccc}1&3&3\end{array},\quad
\begin{array}{ccc}2&2&2\end{array},\quad
\begin{array}{ccc}2&2&3\end{array},\quad
\begin{array}{ccc}2&3&3\end{array},\quad
\begin{array}{ccc}3&3&3\end{array}.
\end{align*}
Their tableau monomials are therefore
\begin{align*}
x_1^3,\quad
x_1^2x_2,\quad
x_1^2x_3,\quad
x_1x_2^2,\quad
x_1x_2x_3,\quad
x_1x_3^2,\quad
x_2^3,\quad
x_2^2x_3,\quad
x_2x_3^2,\quad
x_3^3.
\end{align*}
Hence
\begin{align*}
s_{(3)}(x_1,x_2,x_3)
&= x_1^3+x_1^2x_2+x_1^2x_3+x_1x_2^2+x_1x_2x_3+x_1x_3^2+x_2^3+x_2^2x_3+x_2x_3^2+x_3^3 \\
&= \sum_{1\le a\le b\le c\le 3} x_ax_bx_c \\
&= h_3(x_1,x_2,x_3).
\end{align*}
For the column shape $(1,1,1)$, a filling has the form
\begin{align*}
\begin{array}{c}
a\\
b\\
c
\end{array}
\end{align*}
and the semistandard column condition is $a<b<c$. With $a,b,c\in\{1,2,3\}$, the only strictly increasing triple is $(1,2,3)$, because three distinct entries chosen from a three-element ordered set must use all three entries in increasing order. Therefore
\begin{align*}
s_{(1,1,1)}(x_1,x_2,x_3)
&= x_1x_2x_3 \\
&= e_3(x_1,x_2,x_3).
\end{align*}
Thus single-row tableaux recover complete symmetric functions, while single-column tableaux recover elementary symmetric functions.
[/example]
The example shows that the tableau rule extends the complete and elementary bases. It also motivates a uniform definition of weight, because the exponent of $x_i$ should be the number of boxes labelled $i$.
[definition: Weight of a Tableau]
Let $\lambda$ be a partition. The weight map is the function
\begin{align*}
\operatorname{wt}: \operatorname{SSYT}(\lambda) \to \mathbb N^{(\mathbb N)}
\end{align*}
defined by
\begin{align*}
\operatorname{wt}(T)=(\alpha_1,\alpha_2,\dots),
\end{align*}
where $\alpha_i$ is the number of entries of $T$ equal to $i$, and $\mathbb N^{(\mathbb N)}$ denotes the set of finitely supported sequences of nonnegative integers indexed by positive integers.
[/definition]
The weight is a composition of $|\lambda|$, with all but finitely many parts equal to $0$. This motivates the monomial notation used in the tableau generating function, since the exponents are supplied by the weight.
[definition: Tableau Monomial]
Let $\lambda$ be a partition. The tableau monomial map is the function
\begin{align*}
\operatorname{SSYT}(\lambda) &\to \mathbb Z[x_1,x_2,\dots], \\
T &\mapsto x^T,
\end{align*}
where, if $\operatorname{wt}(T)=\alpha$, then
\begin{align*}
x^T = x^{\alpha} = \prod_{i\ge 1} x_i^{\alpha_i}.
\end{align*}
[/definition]
Here $\mathbb Z[x_1,x_2,\dots]$ means the polynomial ring in countably many variables, and the finite support of $\operatorname{wt}(T)$ ensures that $x^T$ is an ordinary finite monomial. This notation separates the shape from the entries. The shape fixes how many boxes must be filled, while the monomial remembers only the multiplicities of the labels.
[example: Weight in Shape Two One]
Consider the tableau of shape $(2,1)$
\begin{align*}
\begin{array}{cc}
1 & 2\\
3 &
\end{array}.
\end{align*}
The row condition holds because $1\le 2$, and the column condition holds because $1<3$. Counting labels gives one $1$, one $2$, one $3$, and no labels $i\ge 4$, so
\begin{align*}
\operatorname{wt}(T)=(1,1,1,0,0,\dots).
\end{align*}
By the definition of tableau monomial,
\begin{align*}
x^T
&=\prod_{i\ge 1}x_i^{\alpha_i} \\
&=x_1^1x_2^1x_3^1x_4^0x_5^0\cdots \\
&=x_1x_2x_3.
\end{align*}
Now consider the different tableau
\begin{align*}
\begin{array}{cc}
1 & 3\\
2 &
\end{array}.
\end{align*}
Here the row condition holds because $1\le 3$, and the column condition holds because $1<2$. This tableau also contains one $1$, one $2$, one $3$, and no labels $i\ge 4$, hence its weight is again
\begin{align*}
(1,1,1,0,0,\dots),
\end{align*}
and its tableau monomial is
\begin{align*}
x_1^1x_2^1x_3^1x_4^0x_5^0\cdots=x_1x_2x_3.
\end{align*}
Thus the weight records the multiplicity of each label, while the tableau monomial forgets where those labels sit inside the Young diagram.
[/example]
This example uses only three labels, which is the natural setting for explicit calculations. Without a bounded alphabet, even a fixed shape has infinitely many fillings, so finite computations would not produce honest polynomials in a chosen set of variables. To compare such finite computations with the stable ring, we need a notation for tableaux whose entries are bounded by $n$.
[definition: Bounded Semistandard Tableaux]
Let $n \in \mathbb N$. The set $\operatorname{SSYT}_n(\lambda)$ consists of semistandard Young tableaux of shape $\lambda$ whose entries lie in $\{1,\dots,n\}$.
[/definition]
The infinite-variable theory is recovered by letting $n$ vary. Since every tableau has finitely many boxes, each individual monomial involves only finitely many variables.
## Schur Functions as Tableau Generating Functions
Which generating function should be attached to a partition shape? The guiding principle is that each semistandard tableau contributes one monomial, so the shape controls the family of fillings and the entries control the variables.
[definition: Schur Function]
Let $\lambda$ be a partition. For $n \in \mathbb N$, the finite Schur polynomial is the element of $\mathbb Z[x_1,\dots,x_n]$ defined by
\begin{align*}
s_\lambda(x_1,\dots,x_n) = \sum_{T \in \operatorname{SSYT}_n(\lambda)} x^T.
\end{align*}
The stable Schur function indexed by $\lambda$ is the homogeneous element $s_\lambda \in \operatorname{Sym}^{|\lambda|}$ represented by the compatible family of finite Schur polynomials $(s_\lambda(x_1,\dots,x_n))_{n\ge 1}$.
[/definition]
The same stable Schur function can also be written as the bounded-degree symmetric formal series
\begin{align*}
s_\lambda = \sum_{T \in \operatorname{SSYT}(\lambda)} x^T.
\end{align*}
The definitions agree under specialization: setting $x_i=0$ for $i>n$ in the stable expression leaves exactly the tableaux with entries at most $n$. The finite polynomial also vanishes when $\lambda$ has more than $n$ parts, because a column of length greater than $n$ cannot be strictly filled using only $1,\dots,n$.
[example: First Schur Functions]
For a single row of length $r$, a tableau has the form
\begin{align*}
\begin{array}{cccc}
a_1 & a_2 & \cdots & a_r
\end{array}
\end{align*}
with
\begin{align*}
1\le a_1\le a_2\le \cdots \le a_r.
\end{align*}
There is no column condition to check, since every column has only one box. Its tableau monomial is
\begin{align*}
x^T=x_{a_1}x_{a_2}\cdots x_{a_r},
\end{align*}
so the tableau generating function is
\begin{align*}
s_{(r)}
&=\sum_{1\le a_1\le a_2\le \cdots \le a_r} x_{a_1}x_{a_2}\cdots x_{a_r}\\
&=h_r,
\end{align*}
by the definition of the complete homogeneous symmetric function $h_r$.
For a single column of length $r$, a tableau has the form
\begin{align*}
\begin{array}{c}
a_1\\
a_2\\
\vdots\\
a_r
\end{array}
\end{align*}
with
\begin{align*}
1\le a_1<a_2<\cdots<a_r.
\end{align*}
There is no horizontal row condition beyond the one-box rows, and the strict inequalities are exactly the column condition. Its tableau monomial is
\begin{align*}
x^T=x_{a_1}x_{a_2}\cdots x_{a_r},
\end{align*}
hence
\begin{align*}
s_{(1^r)}
&=\sum_{1\le a_1<a_2<\cdots<a_r} x_{a_1}x_{a_2}\cdots x_{a_r}\\
&=e_r,
\end{align*}
by the definition of the elementary symmetric function $e_r$. Thus single-row and single-column Schur functions recover the complete and elementary symmetric functions, respectively.
[/example]
The definition is not visibly symmetric, because the order $1<2<3<\cdots$ is built into the tableau rules. This motivates the first main theorem: despite the ordered alphabet, the generating function belongs to the symmetric world developed earlier.
[quotetheorem:5195]
[citeproof:5195]
[example: Computing Shape Two One in Three Variables]
Let $\lambda=(2,1)$ and restrict entries to $\{1,2,3\}$. A tableau has the form
\begin{align*}
\begin{array}{cc}
a & b\\
c &
\end{array}
\end{align*}
where the row condition gives $a\le b$ and the column condition gives $a<c$. Since $a,c\in\{1,2,3\}$ and $a<c$, the possible pairs $(a,c)$ are
\begin{align*}
(1,2),\quad (1,3),\quad (2,3).
\end{align*}
For each such pair, the entry $b$ must satisfy $a\le b\le 3$. Therefore:
\begin{align*}
(a,c)=(1,2) &: \quad b=1,2,3,\\
(a,c)=(1,3) &: \quad b=1,2,3,\\
(a,c)=(2,3) &: \quad b=2,3.
\end{align*}
Thus the semistandard tableaux are exactly
\begin{align*}
\begin{array}{cc}1&1\\2&\end{array},\quad
\begin{array}{cc}1&2\\2&\end{array},\quad
\begin{array}{cc}1&3\\2&\end{array},\quad
\begin{array}{cc}1&1\\3&\end{array},\quad
\begin{array}{cc}1&2\\3&\end{array},\quad
\begin{array}{cc}1&3\\3&\end{array},\quad
\begin{array}{cc}2&2\\3&\end{array},\quad
\begin{array}{cc}2&3\\3&\end{array}.
\end{align*}
Their weights and tableau monomials are
\begin{align*}
\begin{array}{cc}1&1\\2&\end{array}
&\mapsto (2,1,0) \mapsto x_1^2x_2,\\
\begin{array}{cc}1&2\\2&\end{array}
&\mapsto (1,2,0) \mapsto x_1x_2^2,\\
\begin{array}{cc}1&3\\2&\end{array}
&\mapsto (1,1,1) \mapsto x_1x_2x_3,\\
\begin{array}{cc}1&1\\3&\end{array}
&\mapsto (2,0,1) \mapsto x_1^2x_3,\\
\begin{array}{cc}1&2\\3&\end{array}
&\mapsto (1,1,1) \mapsto x_1x_2x_3,\\
\begin{array}{cc}1&3\\3&\end{array}
&\mapsto (1,0,2) \mapsto x_1x_3^2,\\
\begin{array}{cc}2&2\\3&\end{array}
&\mapsto (0,2,1) \mapsto x_2^2x_3,\\
\begin{array}{cc}2&3\\3&\end{array}
&\mapsto (0,1,2) \mapsto x_2x_3^2.
\end{align*}
Summing one monomial for each tableau gives
\begin{align*}
s_{(2,1)}(x_1,x_2,x_3)
&=x_1^2x_2+x_1x_2^2+x_1x_2x_3+x_1^2x_3+x_1x_2x_3+x_1x_3^2+x_2^2x_3+x_2x_3^2\\
&=x_1^2x_2+x_1^2x_3+x_1x_2^2+2x_1x_2x_3+x_1x_3^2+x_2^2x_3+x_2x_3^2.
\end{align*}
The monomial symmetric function $m_{(2,1)}(x_1,x_2,x_3)$ is the sum of all distinct monomials obtained by permuting the exponents $(2,1,0)$, namely
\begin{align*}
m_{(2,1)}(x_1,x_2,x_3)
&=x_1^2x_2+x_1^2x_3+x_2^2x_1+x_2^2x_3+x_3^2x_1+x_3^2x_2\\
&=x_1^2x_2+x_1^2x_3+x_1x_2^2+x_2^2x_3+x_1x_3^2+x_2x_3^2,
\end{align*}
and
\begin{align*}
m_{(1,1,1)}(x_1,x_2,x_3)=x_1x_2x_3.
\end{align*}
Hence
\begin{align*}
s_{(2,1)}(x_1,x_2,x_3)
&=m_{(2,1)}(x_1,x_2,x_3)+2m_{(1,1,1)}(x_1,x_2,x_3).
\end{align*}
The coefficient $2$ appears because exactly two tableaux of shape $(2,1)$ have weight $(1,1,1)$.
[/example]
This computation displays two central features at once. The coefficients in the monomial expansion count tableaux, and the leading monomial term is determined by the shape itself.
## Kostka Numbers and Dominance Triangularity
How do Schur functions compare with the monomial basis? Since the monomial symmetric function $m_\mu$ collects all distinct monomials whose exponent partition is $\mu$, the coefficient of $m_\mu$ in $s_\lambda$ should count tableaux of shape $\lambda$ with content $\mu$.
[definition: Content of a Tableau]
Let $\lambda$ be a partition, and let $T \in \operatorname{SSYT}(\lambda)$. For a partition $\mu$ of $|\lambda|$, the tableau $T$ has content $\mu$ if the multiset of nonzero parts of $\operatorname{wt}(T)$, arranged in weakly decreasing order, is $\mu$.
[/definition]
Content forgets which label carried which multiplicity, while the coefficient of a particular monomial remembers the labels. This motivates the Kostka number, which fixes the actual weight before symmetry packages all rearrangements into $m_\mu$.
[definition: Kostka Number]
For partitions $\lambda$ and $\mu$ of the same integer, the Kostka number $K_{\lambda\mu}$ is the number of semistandard Young tableaux of shape $\lambda$ and weight $\mu$.
[/definition]
The notation uses weight $\mu$ in the ordered alphabet $1,2,\dots$. This distinction is needed because a tableau expansion of $s_\lambda$ first produces individual monomials with ordered exponent vectors, while $m_\mu$ later groups all monomials whose exponents have partition shape $\mu$. The next structural question is how to pass from that monomial-by-monomial tableau sum to a basis expansion in the $m_\mu$. The answer should identify each coefficient with the appropriate Kostka count, after symmetry has collected the relabelled monomials together.
[quotetheorem:5196]
[citeproof:5196]
The expansion reduces structural questions about Schur functions to enumerative questions about tableaux, but its hypotheses are doing real work. The shape must be a partition because the semistandard row and column rules refer to a Young diagram with left-justified rows of weakly decreasing lengths; for a nonpartition composition shape, the same filling rules do not produce the dominance-triangular change of basis used below. The weight condition also matters: $K_{\lambda\mu}$ counts tableaux of fixed ordered weight $\mu$, while $m_\mu$ then collects all relabellings with exponent partition $\mu$. The theorem does not by itself decide which $K_{\lambda\mu}$ vanish or give a closed formula for the nonzero values; it only says that once the relevant tableaux have been counted, those counts are exactly the monomial coefficients. For instance, the expansion for $s_{(2,1)}$ has coefficient $2$ on $m_{(1,1,1)}$, and the theorem explains this as two tableaux rather than as a consequence of symmetry alone. The next issue is when these numbers vanish and what happens on the diagonal.
[definition: Dominance Order]
Let $\lambda$ and $\mu$ be partitions of $n$. We say that $\lambda$ dominates $\mu$, written $\lambda \unrhd \mu$, if
\begin{align*}
\sum_{i=1}^{k}\lambda_i \ge \sum_{i=1}^{k}\mu_i
\end{align*}
for every $k \ge 1$, where missing parts are treated as $0$.
[/definition]
Dominance compares how much of the diagram is concentrated in the early rows. This motivates the triangularity theorem, because semistandard tableaux force small labels to occupy a controlled initial region of the Young diagram.
[illustration:semistandard-dominance-subdiagram]
The counting problem now becomes a geometric constraint on where the small labels can sit. If a tableau has weight $\mu$, then the first $\mu_1+\cdots+\mu_k$ labels must fit into the first $k$ rows of the shape; comparing this number with $\lambda_1+\cdots+\lambda_k$ gives the dominance inequalities and identifies the diagonal case.
[quotetheorem:5197]
[citeproof:5197]
The diagonal tableau has row $i$ filled entirely with $i$, and its uniqueness is what supplies the diagonal coefficient $1$. The dominance hypothesis is necessary for any tableau to exist: if $\mu$ places too many small labels in the first $k$ weights, column strictness leaves no room for them below row $k$. Dominance is only a vanishing condition, however, not a formula for the remaining Kostka numbers; the example below has $K_{(2,1),(1,1,1)}=2$, while the theorem predicts only that it may be nonzero. This limitation is important because later product rules require richer combinatorics than dominance alone. What unitriangularity gives immediately is the shape of the transition matrix, which is enough for the basis theorem.
[example: Kostka Numbers for Degree Three]
The partitions of $3$ are $(3)$, $(2,1)$, and $(1,1,1)$. Their dominance comparisons are
\begin{align*}
3 &\ge 2, &
3 &\ge 2+1,
\end{align*}
so $(3)\unrhd (2,1)$, and
\begin{align*}
2 &\ge 1, &
2+1 &\ge 1+1, &
2+1+0 &\ge 1+1+1,
\end{align*}
so $(2,1)\unrhd (1,1,1)$. Thus the dominance order in degree $3$ is
\begin{align*}
(3) \unrhd (2,1) \unrhd (1,1,1).
\end{align*}
For shape $(2,1)$ in the alphabet $\{1,2,3\}$, the earlier enumeration gives
\begin{align*}
s_{(2,1)}(x_1,x_2,x_3)
&=x_1^2x_2+x_1^2x_3+x_1x_2^2+2x_1x_2x_3+x_1x_3^2+x_2^2x_3+x_2x_3^2.
\end{align*}
The monomial symmetric functions in degree $3$ with at most three variables include
\begin{align*}
m_{(2,1)}(x_1,x_2,x_3)
&=x_1^2x_2+x_1^2x_3+x_2^2x_1+x_2^2x_3+x_3^2x_1+x_3^2x_2\\
&=x_1^2x_2+x_1^2x_3+x_1x_2^2+x_2^2x_3+x_1x_3^2+x_2x_3^2,
\end{align*}
and
\begin{align*}
m_{(1,1,1)}(x_1,x_2,x_3)=x_1x_2x_3.
\end{align*}
Substituting these two expressions gives
\begin{align*}
m_{(2,1)}(x_1,x_2,x_3)+2m_{(1,1,1)}(x_1,x_2,x_3)
&=x_1^2x_2+x_1^2x_3+x_1x_2^2+x_2^2x_3+x_1x_3^2+x_2x_3^2+2x_1x_2x_3\\
&=s_{(2,1)}(x_1,x_2,x_3).
\end{align*}
Therefore the stable expansion is
\begin{align*}
s_{(2,1)}=m_{(2,1)}+2m_{(1,1,1)}.
\end{align*}
The coefficient of $m_{(2,1)}$ is $1$, so $K_{(2,1),(2,1)}=1$. The coefficient of $m_{(1,1,1)}$ is $2$, so $K_{(2,1),(1,1,1)}=2$. Finally,
\begin{align*}
2 < 3,
\end{align*}
so $(2,1)$ does not dominate $(3)$, and hence $K_{(2,1),(3)}=0$. This degree-three case displays the unitriangular pattern: the diagonal Kostka number is $1$, and the only nonzero lower term for $s_{(2,1)}$ is counted by the two tableaux of weight $(1,1,1)$.
[/example]
This example is small enough to enumerate by hand, but it already shows the unitriangular pattern. The coefficient matrix from Schur functions to monomial symmetric functions is upper triangular with diagonal entries equal to $1$ when partitions are ordered by dominance.
## The Schur Basis of Symmetric Functions
Why does unitriangularity matter algebraically? The monomial symmetric functions form a $\mathbb Z$-basis of $\operatorname{Sym}$, so a unitriangular change of basis over $\mathbb Z$ produces another $\mathbb Z$-basis.
[quotetheorem:5182]
[citeproof:5182]
The theorem is the main algebraic payoff of the chapter. The unit diagonal over $\mathbb Z$ is essential: it says the transition matrix has determinant $1$ degree by degree, so the inverse transition matrix again has integer entries. A merely triangular matrix with diagonal entries different from $\pm 1$ would not give a $\mathbb Z$-basis; for example, replacing a basis vector $m_\lambda$ by $2m_\lambda$ still gives a triangular transition matrix over $\mathbb Q$, but it cannot span $m_\lambda$ over $\mathbb Z$. Thus the result is stronger than linear independence over a field: it identifies Schur functions as an integral basis of the symmetric-function ring. This integrality is what lets later structure constants, such as Littlewood-Richardson coefficients, be interpreted as integers counted by combinatorial objects.
[remark: Positivity as Enumeration]
The coefficient $K_{\lambda\mu}$ in the expansion of $s_\lambda$ into the monomial basis is not merely nonnegative; it is the cardinality of a concrete set of tableaux. This is the first instance of a recurring theme in algebraic combinatorics: a structural coefficient becomes understandable once it is realized as the size of a natural combinatorial family.
[/remark]
The Schur basis will interact with generating functions through the Cauchy identity in Chapter 6 and with multiplication through the Pieri rules in Chapter 7 and the Littlewood-Richardson rule in Chapter 8. It also connects this combinatorial course to representation theory: $s_\lambda$ is the character of an irreducible polynomial representation of the general linear group, and Kostka numbers record weight multiplicities in that representation. In geometry, the same basis appears through Schubert classes on Grassmannians, where Littlewood-Richardson coefficients become intersection numbers. The tableau definition developed here is the common language for all of these directions: it keeps track of weights for symmetric functions and shapes for combinatorial operations on diagrams.
The tableau definition gives Schur functions their combinatorial meaning, but determinant formulae reveal their algebraic structure. Chapter 5 turns to Jacobi-Trudi and related identities, showing how the tableau basis can also be recovered from linear-algebraic expressions built from the classical symmetric functions.
# 5. Determinantal Formulae
This chapter turns the tableau definition of Schur polynomials into determinant formulae. We assume the preceding material on partitions and the ring $\operatorname{Sym}$ from Chapter 1, the complete homogeneous and elementary symmetric functions $h_r$ and $e_r$ from Chapter 2, and the semistandard Young tableaux and Schur functions from Chapter 4. The guiding question is how a symmetric polynomial with a combinatorial definition can be recovered from linear algebra: first through alternating polynomials and Vandermonde determinants, then through determinants built from the complete homogeneous and elementary symmetric functions. These formulae are computational tools, but they also explain why Schur functions behave like the central basis of the ring of symmetric functions, appearing both as tableau generating functions and as characters in representation-theoretic settings.
## Alternants and the Bialternant Formula
The first problem is to understand how antisymmetry can force divisibility. If a polynomial changes sign whenever two variables are interchanged, then it must vanish when those two variables are equal, so each difference $x_i-x_j$ should divide it. The product of all such differences is the Vandermonde determinant, and it is the universal alternating factor.
[definition: Alternant]
Let $n \in \mathbb N$, let $\alpha=(\alpha_1,\dots,\alpha_n)$ be a sequence of nonnegative integers, and let $x=(x_1,\dots,x_n)$. The alternant attached to $\alpha$ is
\begin{align*}
a_\alpha(x)=\det(x_i^{\alpha_j})_{1\le i,j\le n}.
\end{align*}
It is an element of $\mathbb Z[x_1,\dots,x_n]$.
[/definition]
Alternants are determinants whose columns are monomial columns. Swapping two variables swaps two rows, so the determinant changes sign. If two exponents coincide, then two columns coincide and the alternant is zero; hence the most useful alternants come from strictly decreasing exponent sequences.
A boundary case already shows why the exponent sequence matters. For $n=3$ and $\alpha=(2,2,0)$, the first two columns agree, so $a_\alpha=0$ and no Schur function can be recovered by dividing by the Vandermonde. The staircase shift used below is precisely what prevents this collapse for a partition $\lambda$.
[example: Three-Variable Alternant]
For $n=3$ and $\alpha=(4,2,0)$, the alternant is
\begin{align*}
a_{(4,2,0)}(x_1,x_2,x_3)
&=\det\begin{pmatrix}
x_1^4&x_1^2&1\\
x_2^4&x_2^2&1\\
x_3^4&x_3^2&1
\end{pmatrix}.
\end{align*}
Setting $y_i=x_i^2$, this determinant is the three-variable Vandermonde determinant in the variables $y_1,y_2,y_3$:
\begin{align*}
a_{(4,2,0)}(x_1,x_2,x_3)
&=\det\begin{pmatrix}
y_1^2&y_1&1\\
y_2^2&y_2&1\\
y_3^2&y_3&1
\end{pmatrix}\\
&=(y_1-y_2)(y_1-y_3)(y_2-y_3)\\
&=(x_1^2-x_2^2)(x_1^2-x_3^2)(x_2^2-x_3^2)\\
&=(x_1-x_2)(x_1+x_2)(x_1-x_3)(x_1+x_3)(x_2-x_3)(x_2+x_3).
\end{align*}
Thus the Vandermonde factor in three variables divides the alternant, and the quotient is
\begin{align*}
\frac{a_{(4,2,0)}(x_1,x_2,x_3)}{(x_1-x_2)(x_1-x_3)(x_2-x_3)}
&=(x_1+x_2)(x_1+x_3)(x_2+x_3)\\
&=(x_1^2+x_1x_3+x_1x_2+x_2x_3)(x_2+x_3)\\
&=x_1^2x_2+x_1^2x_3+x_1x_2^2+2x_1x_2x_3+x_1x_3^2+x_2^2x_3+x_2x_3^2.
\end{align*}
This quotient is symmetric and has total degree $3$. Since for $\lambda=(2,1,0)$ and $\delta=(2,1,0)$ one has $\lambda+\delta=(4,2,0)$, the bialternant formula below will identify this quotient with $s_{(2,1)}(x_1,x_2,x_3)$.
[/example]
This example raises the need for a canonical alternating divisor. To make the quotient of an alternant into a symmetric function, we first isolate the alternating polynomial with the smallest strictly decreasing exponent sequence.
[definition: Vandermonde Alternant]
For $n\in \mathbb N$, set
\begin{align*}
\delta=(n-1,n-2,\dots,1,0).
\end{align*}
The Vandermonde alternant is the polynomial $a_\delta(x_1,\dots,x_n)\in \mathbb Z[x_1,\dots,x_n]$.
[/definition]
The notation $\delta$ records the staircase partition. Its determinant has a closed product form, and this product form is what makes division by $a_\delta$ meaningful inside the polynomial ring.
[quotetheorem:1673]
[citeproof:1673]
The strict inequalities $i<j$ in the product matter: each unordered pair of variables contributes exactly one linear factor, so using both $x_i-x_j$ and $x_j-x_i$ would double-count the vanishing along the same diagonal. The chosen column order also fixes the sign. If the determinant were written with columns $1,x_i,\dots,x_i^{n-1}$ instead of $x_i^{n-1},\dots,1$, the product would be $\prod_{i<j}(x_j-x_i)$ rather than the displayed product. The staircase exponents are also essential: for $n=2$ and $\alpha=(1,1)$, the alternant
\begin{align*}
\det\begin{pmatrix}x_1&x_1\\ x_2&x_2\end{pmatrix}=0
\end{align*}
does not recover $x_1-x_2$, because repeated exponents make two columns identical. The theorem does not say that every alternating polynomial is equal to the Vandermonde; it says that the Vandermonde is the minimal alternating factor, and any alternating polynomial is a symmetric polynomial times it. For example, in three variables the polynomial $(x_1+x_2+x_3)a_\delta$ is alternating of degree $4$, so omitting the symmetric factor would give the wrong alternating polynomial even though the same vanishing diagonals occur. Likewise $a_{(4,2,0)}$ has degree $6$, while the Vandermonde has degree $3$, so the quotient must carry the remaining symmetric degree $3$. This divisibility statement is the input that makes the bialternant quotient a polynomial rather than only a rational expression.
The remaining difficulty is to put a partition into an alternant without losing the strict decrease needed for a nonzero determinant, while still recovering the Schur polynomial defined by tableaux in Chapter 4. The raw exponent sequence $\lambda$ may have repeated parts, so $a_\lambda$ can vanish even when the Schur polynomial is nonzero. The staircase $\delta$ separates the exponents just enough to make the determinant alternating, while still remembering the row lengths of $\lambda$ after the Vandermonde factor is removed.
[definition: Bialternant Schur Polynomial]
Let $\lambda=(\lambda_1,\dots,\lambda_n)$ be a partition with at most $n$ parts, allowing trailing zeroes. The bialternant expression attached to $\lambda$ is
\begin{align*}
\frac{a_{\lambda+\delta}(x_1,\dots,x_n)}{a_\delta(x_1,\dots,x_n)},
\end{align*}
where
\begin{align*}
\lambda+\delta=(\lambda_1+n-1,\lambda_2+n-2,\dots,\lambda_n).
\end{align*}
It is initially regarded as an element of the fraction field $\operatorname{Frac}(\mathbb Z[x_1,\dots,x_n])$.
[/definition]
This quotient is initially only a rational expression. Since the numerator is alternating, the Vandermonde divisibility result shows that the quotient is a polynomial; because both numerator and denominator change sign under every transposition, the quotient is symmetric. The next result identifies this symmetric polynomial with the Schur polynomial previously defined by semistandard Young tableaux.
[quotetheorem:5198]
[citeproof:5198]
The condition that $\lambda$ have at most $n$ parts is necessary in $n$ variables: a column of height greater than $n$ cannot be filled strictly with entries from $\{1,\dots,n\}$. For example, with $n=2$ and $\lambda=(1,1,1)$, every tableau would need a column of height $3$ using only the entries $1$ and $2$, so $s_{(1,1,1)}(x_1,x_2)=0$. On the determinant side, forcing this shape into two variables loses the third part; if instead one pads the alternant data in three rows but evaluates in only two variables, two determinant rows correspond to the same variable and the numerator alternant is zero. This is the same obstruction expressed linearly: there are not enough distinct variable rows to support a strictly decreasing shifted exponent pattern of length $3$. The formula does not by itself give a stable expression in $\operatorname{Sym}$, because both numerator and denominator depend on the chosen number of variables. Its strength is instead finite-variable control: it makes symmetry, divisibility by the Vandermonde, and leading monomials visible at once. The next determinant formula removes the Vandermonde quotient by replacing the finite-variable alternant argument with row generating functions.
[example: Computing $s_{(3,2)}$ From the Bialternant]
Take $\lambda=(3,2)$ and work in two variables. Then $\delta=(1,0)$, so $\lambda+\delta=(4,2)$, and the *Bialternant Formula* gives
\begin{align*}
s_{(3,2)}(x_1,x_2)
&=\frac{a_{(4,2)}(x_1,x_2)}{a_{(1,0)}(x_1,x_2)}\\
&=\frac{\det\begin{pmatrix}x_1^4&x_1^2\\ x_2^4&x_2^2\end{pmatrix}}{\det\begin{pmatrix}x_1&1\\ x_2&1\end{pmatrix}}\\
&=\frac{x_1^4x_2^2-x_2^4x_1^2}{x_1-x_2}\\
&=\frac{x_1^2x_2^2(x_1^2-x_2^2)}{x_1-x_2}\\
&=\frac{x_1^2x_2^2(x_1-x_2)(x_1+x_2)}{x_1-x_2}\\
&=x_1^2x_2^2(x_1+x_2)\\
&=x_1^3x_2^2+x_1^2x_2^3.
\end{align*}
The tableau expansion gives the same two monomials. In shape $(3,2)$ with entries from $\{1,2\}$, the two boxes in the second row lie below the first two boxes of the first row, so strict column increase forces both second-row entries to be $2$. The entries above them must then both be $1$, and the remaining third box in the first row may be either $1$ or $2$ because rows are weakly increasing. Thus the only possible first rows are $(1,1,1)$ and $(1,1,2)$, giving weights $x_1^3x_2^2$ and $x_1^2x_2^3$, exactly the two terms obtained from the bialternant quotient.
[/example]
## Complete Homogeneous Determinants
The bialternant formula is powerful, but it depends on a chosen number of variables and on a quotient by the Vandermonde. The next problem is to express $s_\lambda$ directly in the stable ring $\operatorname{Sym}$ using one of the classical bases already developed. The complete homogeneous functions are natural because they encode weakly increasing rows, which is the row condition in semistandard tableaux.
[definition: Jacobi-Trudi Matrix]
Let $\lambda=(\lambda_1,\dots,\lambda_\ell)$ be a partition of length at most $\ell$. The Jacobi-Trudi matrix of shape $\lambda$ is the $\ell\times \ell$ matrix
\begin{align*}
\bigl(h_{\lambda_i-i+j}\bigr)_{1\le i,j\le \ell},
\end{align*}
viewed as an element of $\operatorname{Mat}_{\ell}(\operatorname{Sym})$,
with the conventions $h_0=1$ and $h_r=0$ for $r<0$.
[/definition]
The matrix packages the row data of $\lambda$ into complete homogeneous functions, but the definition alone only gives a candidate determinant in $\operatorname{Sym}$. The obstacle is that $h_{\lambda_i}$ counts row $i$ as an independent weakly increasing row, while a semistandard tableau also requires strict increase down each column. Thus the next question is whether the shifts $-i+j$ provide exactly the signed correction needed to remove the fillings that violate the column condition. The Jacobi-Trudi identity is needed to answer that question: it turns an apparently rowwise determinant into the tableau generating function itself, so the row-generating functions recover the same Schur function that was first defined by tableaux.
This correction is already visible in shape $(2,1)$. Choosing a weak row of length $2$ and then a weak row of length $1$ also counts fillings where the lower entry is less than or equal to the upper entry in the first column, which violates semistandard column strictness. The next theorem is needed because the Jacobi-Trudi determinant subtracts precisely these bad-row configurations through its signed shifted row lengths, leaving the Schur function and no extra terms.
[quotetheorem:5181]
[citeproof:5181]
The conventions $h_0=1$ and $h_r=0$ for $r<0$ are essential hypotheses of the formula, not cosmetic choices. A concrete failure appears when one pads a partition by extra zero rows. For $\lambda=(1)$ computed with $\ell=2$, the Jacobi-Trudi determinant is
\begin{align*}
\det\begin{pmatrix}h_1&h_2\\ h_{-1}&h_0\end{pmatrix}.
\end{align*}
With the stated conventions this gives
\begin{align*}
\det\begin{pmatrix}h_1&h_2\\ h_{-1}&h_0\end{pmatrix}=h_1.
\end{align*}
If $h_{-1}$ were assigned any nonzero value instead of $0$, the determinant would become $h_1-h_2h_{-1}$ and would no longer equal $s_{(1)}=h_1$. Thus the zero convention for negative indices is what makes extra padded rows harmless. In finitely many variables, the identity remains valid after specialising $h_r$ to those variables, but Schur polynomials with more than $n$ rows vanish because the tableau column condition cannot be met; for instance $s_{(1,1,1)}(x_1,x_2)=0$ although the stable symmetric function $s_{(1,1,1)}=e_3$ is nonzero. The determinant size is also not always computationally efficient: a partition with many rows gives a large Jacobi-Trudi determinant even when its conjugate has few columns. This is why the dual Jacobi-Trudi formula below is not merely a symmetric curiosity but a practical alternative.
[example: Computing $s_{(3,2)}$ By Jacobi-Trudi]
For $\lambda=(3,2)$, the Jacobi-Trudi matrix has size $2$, and its entries are
\begin{align*}
h_{\lambda_i-i+j}
&=\begin{pmatrix}
h_{3-1+1}&h_{3-1+2}\\
h_{2-2+1}&h_{2-2+2}
\end{pmatrix}
=\begin{pmatrix}
h_3&h_4\\
h_1&h_2
\end{pmatrix}.
\end{align*}
Thus
\begin{align*}
s_{(3,2)}
&=\det\begin{pmatrix}h_3&h_4\\ h_1&h_2\end{pmatrix}\\
&=h_3h_2-h_4h_1.
\end{align*}
Since $h_r$ is the sum of all monomial symmetric functions of degree $r$, we have
\begin{align*}
h_3&=m_{(3)}+m_{(2,1)}+m_{(1,1,1)},\\
h_2&=m_{(2)}+m_{(1,1)},\\
h_4&=m_{(4)}+m_{(3,1)}+m_{(2,2)}+m_{(2,1,1)}+m_{(1,1,1,1)},\\
h_1&=m_{(1)}.
\end{align*}
For a fixed monomial of exponent partition $\gamma\vdash 5$, the coefficient of $m_\gamma$ in $h_3h_2$ is the number of ways to choose a degree-$2$ exponent vector bounded componentwise by $\gamma$; the coefficient in $h_4h_1$ is the number of positive parts of $\gamma$. This gives
\begin{align*}
h_3h_2
&=m_{(5)}+2m_{(4,1)}+3m_{(3,2)}+4m_{(3,1,1)}
+5m_{(2,2,1)}\\
&\qquad +7m_{(2,1,1,1)}+10m_{(1,1,1,1,1)},\\
h_4h_1
&=m_{(5)}+2m_{(4,1)}+2m_{(3,2)}+3m_{(3,1,1)}
+3m_{(2,2,1)}\\
&\qquad +4m_{(2,1,1,1)}+5m_{(1,1,1,1,1)}.
\end{align*}
Subtracting term by term,
\begin{align*}
s_{(3,2)}
&=h_3h_2-h_4h_1\\
&=m_{(3,2)}+m_{(3,1,1)}+2m_{(2,2,1)}
+3m_{(2,1,1,1)}+5m_{(1,1,1,1,1)}.
\end{align*}
The determinant cancels the terms of types $(5)$ and $(4,1)$, which correspond to row fillings that cannot satisfy the two strict column increases in shape $(3,2)$. For example, the coefficient $2$ of $m_{(2,2,1)}$ records the two semistandard tableaux of shape $(3,2)$ with content $(2,2,1)$, up to relabelling of the entries.
[/example]
The determinant also handles shapes with more rows, where the tableau expansion becomes less compact. The convention $h_r=0$ for negative $r$ is part of the formula rather than an afterthought: it makes impossible shifted rows contribute zero.
[example: Computing $s_{(2,1,1)}$ By Jacobi-Trudi]
For $\lambda=(2,1,1)$, take $\ell=3$. By the *Jacobi-Trudi Identity*,
\begin{align*}
s_{(2,1,1)}
&=\det\begin{pmatrix}
h_2&h_3&h_4\\
h_0&h_1&h_2\\
h_{-1}&h_0&h_1
\end{pmatrix}\\
&=\det\begin{pmatrix}
h_2&h_3&h_4\\
1&h_1&h_2\\
0&1&h_1
\end{pmatrix},
\end{align*}
using the conventions $h_0=1$ and $h_r=0$ for $r<0$. Expanding the determinant along the first row gives
\begin{align*}
s_{(2,1,1)}
&=h_2\det\begin{pmatrix}h_1&h_2\\ 1&h_1\end{pmatrix}
-h_3\det\begin{pmatrix}1&h_2\\ 0&h_1\end{pmatrix}
+h_4\det\begin{pmatrix}1&h_1\\ 0&1\end{pmatrix}\\
&=h_2(h_1^2-h_2)-h_3(h_1-0)+h_4(1-0)\\
&=h_2(h_1^2-h_2)-h_3h_1+h_4.
\end{align*}
Now
\begin{align*}
h_1^2
&=\left(\sum_i x_i\right)\left(\sum_j x_j\right)\\
&=\sum_i x_i^2+2\sum_{i<j}x_ix_j\\
&=m_{(2)}+2m_{(1,1)},
\end{align*}
so
\begin{align*}
h_1^2-h_2
&=\bigl(m_{(2)}+2m_{(1,1)}\bigr)-\bigl(m_{(2)}+m_{(1,1)}\bigr)\\
&=m_{(1,1)}.
\end{align*}
Therefore
\begin{align*}
h_2(h_1^2-h_2)
&=(m_{(2)}+m_{(1,1)})m_{(1,1)}\\
&=\bigl(m_{(3,1)}+m_{(2,1,1)}\bigr)
+\bigl(m_{(2,2)}+2m_{(2,1,1)}+6m_{(1,1,1,1)}\bigr)\\
&=m_{(3,1)}+m_{(2,2)}+3m_{(2,1,1)}+6m_{(1,1,1,1)}.
\end{align*}
Also
\begin{align*}
h_3h_1
&=(m_{(3)}+m_{(2,1)}+m_{(1,1,1)})m_{(1)}\\
&=\bigl(m_{(4)}+m_{(3,1)}\bigr)
+\bigl(m_{(3,1)}+2m_{(2,2)}+2m_{(2,1,1)}\bigr)
+\bigl(m_{(2,1,1)}+4m_{(1,1,1,1)}\bigr)\\
&=m_{(4)}+2m_{(3,1)}+2m_{(2,2)}+3m_{(2,1,1)}+4m_{(1,1,1,1)},
\end{align*}
and
\begin{align*}
h_4=m_{(4)}+m_{(3,1)}+m_{(2,2)}+m_{(2,1,1)}+m_{(1,1,1,1)}.
\end{align*}
Substituting these three expansions into the determinant expression,
\begin{align*}
s_{(2,1,1)}
&=\bigl(m_{(3,1)}+m_{(2,2)}+3m_{(2,1,1)}+6m_{(1,1,1,1)}\bigr)\\
&\qquad-\bigl(m_{(4)}+2m_{(3,1)}+2m_{(2,2)}+3m_{(2,1,1)}+4m_{(1,1,1,1)}\bigr)\\
&\qquad+\bigl(m_{(4)}+m_{(3,1)}+m_{(2,2)}+m_{(2,1,1)}+m_{(1,1,1,1)}\bigr)\\
&=m_{(2,1,1)}+3m_{(1,1,1,1)}.
\end{align*}
The tableau count gives the same coefficients. In shape $(2,1,1)$, the first column has height $3$, so its entries must be three distinct entries in increasing order. For content type $(2,1,1)$, the repeated entry must occupy the extra box in the first row, giving one tableau up to relabelling. For content type $(1,1,1,1)$, the top-left box must contain the smallest entry, and the extra first-row box may contain any one of the remaining three entries; the two unused entries then fill the lower boxes of the first column in increasing order. Thus the tableau expansion also gives $m_{(2,1,1)}+3m_{(1,1,1,1)}$.
[/example]
## Elementary Determinants and Conjugate Shapes
The complete homogeneous formula is row-based. The final problem in this chapter is to obtain the column-based analogue. Since elementary symmetric functions encode strictly increasing selections, they match columns in the same way that complete homogeneous functions match rows.
[definition: Conjugate Partition]
Conjugation is the map from the set of partitions to itself sending a partition $\lambda$ to the partition $\lambda'$. For a partition $\lambda$, its conjugate partition $\lambda'$ is defined by
\begin{align*}
\lambda'_j=|\{i:\lambda_i\ge j\}|.
\end{align*}
[/definition]
Conjugation interchanges rows and columns of the Young diagram. This motivates the dual determinant: the indexing should use $\lambda'$, and the entries should be elementary symmetric functions because they generate strictly increasing column choices.
[definition: Dual Jacobi-Trudi Matrix]
Let $\lambda$ be a partition and let $m\ge \lambda_1$. The dual Jacobi-Trudi matrix of shape $\lambda$ is the $m\times m$ matrix
\begin{align*}
\bigl(e_{\lambda'_i-i+j}\bigr)_{1\le i,j\le m},
\end{align*}
viewed as an element of $\operatorname{Mat}_{m}(\operatorname{Sym})$,
with the conventions $e_0=1$ and $e_r=0$ for $r<0$.
[/definition]
[illustration:conjugate-determinant-data]
The shape parameter is now controlled by the number of columns. The formula is dual because it replaces weak row generating functions by strict column generating functions.
[quotetheorem:5199]
[citeproof:5199]
The lower bound $m\ge \lambda_1$ ensures that the determinant has enough rows and columns to see every column of the Young diagram. If too few columns are used, the determinant can compute the wrong function: for $\lambda=(2)$, taking $m=1$ gives the $1\times 1$ determinant $e_{\lambda'_1}=e_1$, while the correct Schur function is $s_{(2)}=h_2$. Increasing $m$ beyond $\lambda_1$ does not change the value, because the added rows and columns contribute an identity-like tail through $e_0=1$ and $e_r=0$ for $r<0$. The formula does not replace the need for the complete homogeneous version: for shapes with many columns and few rows, the dual determinant may be larger than Jacobi-Trudi. Its role is to expose the column version of the same inclusion-exclusion and the involution exchanging $h_r$ with $e_r$.
[example: Computing $s_{(2,1,1)}$ By Dual Jacobi-Trudi]
For $\lambda=(2,1,1)$, the conjugate partition is $\lambda'=(3,1)$, because the first column has height $3$ and the second column has height $1$. Since $\lambda_1=2$, the dual Jacobi-Trudi determinant has size $2$, and by the *Dual Jacobi-Trudi Identity* its entries are $e_{\lambda'_i-i+j}$:
\begin{align*}
s_{(2,1,1)}
&=\det\begin{pmatrix}
e_{\lambda'_1-1+1}&e_{\lambda'_1-1+2}\\
e_{\lambda'_2-2+1}&e_{\lambda'_2-2+2}
\end{pmatrix}\\
&=\det\begin{pmatrix}
e_{3-1+1}&e_{3-1+2}\\
e_{1-2+1}&e_{1-2+2}
\end{pmatrix}\\
&=\det\begin{pmatrix}
e_3&e_4\\
e_0&e_1
\end{pmatrix}\\
&=\det\begin{pmatrix}
e_3&e_4\\
1&e_1
\end{pmatrix}\\
&=e_3e_1-e_4.
\end{align*}
To expand this in the monomial basis, use $e_3=m_{(1,1,1)}$, $e_1=m_{(1)}$, and $e_4=m_{(1,1,1,1)}$. Thus
\begin{align*}
e_3e_1
&=\left(\sum_{i<j<k}x_ix_jx_k\right)\left(\sum_\ell x_\ell\right).
\end{align*}
For a fixed triple $i<j<k$, multiplying by $x_i$, $x_j$, or $x_k$ gives monomials of type $(2,1,1)$:
\begin{align*}
x_ix_jx_k(x_i+x_j+x_k)
&=x_i^2x_jx_k+x_ix_j^2x_k+x_ix_jx_k^2.
\end{align*}
Each monomial of type $(2,1,1)$ arises exactly once in this way, by taking its three variables as the triple and choosing the repeated variable from $e_1$. If $\ell$ is distinct from $i,j,k$, then $x_ix_jx_kx_\ell$ has type $(1,1,1,1)$; for any fixed set of four distinct variables, there are four choices for which one is supplied by $e_1$. Therefore
\begin{align*}
e_3e_1
&=m_{(2,1,1)}+4m_{(1,1,1,1)}.
\end{align*}
Subtracting $e_4=m_{(1,1,1,1)}$ gives
\begin{align*}
s_{(2,1,1)}
&=e_3e_1-e_4\\
&=\bigl(m_{(2,1,1)}+4m_{(1,1,1,1)}\bigr)-m_{(1,1,1,1)}\\
&=m_{(2,1,1)}+3m_{(1,1,1,1)}.
\end{align*}
The column-based determinant therefore gives the same symmetric function obtained earlier from the row-based Jacobi-Trudi computation and from the semistandard tableau count.
[/example]
## Comparing the Three Determinantal Formulae
The remaining problem is not to find another formula, but to decide which determinant should be used in a given argument. The three formulae have the same value, yet they expose different structure: the bialternant formula is closest to linear algebra in finitely many variables, Jacobi-Trudi is closest to row generating functions, and dual Jacobi-Trudi is closest to column generating functions. Their agreement is a useful consistency check on the theory: tableaux, alternants, and the $h$/$e$ bases all encode the same symmetric functions.
[remark: Choice of Formula]
For explicit computations, the shortest determinant usually comes from the smaller of the number of rows and the number of columns. For structural arguments in $\operatorname{Sym}$, Jacobi-Trudi and dual Jacobi-Trudi are often preferable because they avoid choosing a finite variable set. For arguments involving divisibility, leading terms, or specialisations in $n$ variables, the bialternant formula is usually the natural form.
[/remark]
These formulae prepare the ground for the Cauchy identities and product rules. In later chapters, determinants will be replaced by generating functions and tableaux rules, but the identities proved here remain the algebraic backbone behind those expansions.
Once Schur functions have determinant descriptions, they can be assembled into generating kernels that encode whole families of identities at once. Chapter 6 uses this perspective to derive Cauchy-type formulas and to connect the Hall inner product with product expansions in a uniform way.
# 6. Cauchy Identities and Kernels
This chapter turns the bases of symmetric functions into a machine for producing identities. Building on the Hall inner product of Chapter 3 and the Schur basis of Chapter 4, the central object is the Cauchy kernel, a generating series in two independent alphabets that records the Hall inner product and makes duality visible. We use it first to relate the complete homogeneous and monomial bases, then to derive the Schur Cauchy identity, and finally to see how the omega involution converts it into the dual Cauchy identity.
## The Cauchy Kernel Over Two Alphabets
What generating series should encode all complete homogeneous symmetric functions at once, while also remembering a second set of variables for coefficient extraction? A single alphabet generating function $\sum_{n\ge 0}h_n(x)t^n$ records only one exponent at a time, so it cannot distinguish the different parts of a partition or support coefficient extraction against another basis. The remedy is to use two alphabets $x=(x_1,x_2,\dots)$ and $y=(y_1,y_2,\dots)$ and form a product over all pairs of variables.
[definition: Cauchy Kernel]
Let $x=(x_1,x_2,\dots)$ and $y=(y_1,y_2,\dots)$ be two alphabets. The Cauchy kernel is the formal power series
\begin{align*}
\Omega(x,y) := \prod_{i,j\ge 1} (1-x_i y_j)^{-1}.
\end{align*}
[/definition]
The product is interpreted degree by degree in the completed tensor product $\widehat{\operatorname{Sym}} \widehat{\otimes} \widehat{\operatorname{Sym}}$. In any fixed total degree only finitely many partitions contribute, so identities involving $\Omega(x,y)$ may be checked after restricting to finitely many variables and then passing to the stable limit.
[example: Two Variable Expansion]
Take $x=(x_1,x_2)$ and $y=(y_1,y_2)$. For each fixed $j$, expanding the two geometric series in powers of $y_j$ gives
\begin{align*}
\prod_{i=1}^{2}(1-x_i y_j)^{-1}
&=(1+x_1y_j+x_1^2y_j^2+\cdots)(1+x_2y_j+x_2^2y_j^2+\cdots)\\
&=1+(x_1+x_2)y_j+(x_1^2+x_1x_2+x_2^2)y_j^2+\cdots\\
&=1+h_1(x)y_j+h_2(x)y_j^2+\cdots .
\end{align*}
Thus, keeping only terms of total degree at most $2$ in $y_1,y_2$,
\begin{align*}
\prod_{i=1}^{2}\prod_{j=1}^{2}(1-x_i y_j)^{-1}
&=\bigl(1+h_1(x)y_1+h_2(x)y_1^2+\cdots\bigr)
\bigl(1+h_1(x)y_2+h_2(x)y_2^2+\cdots\bigr)\\
&=1+h_1(x)(y_1+y_2)+h_2(x)(y_1^2+y_2^2)+h_1(x)^2y_1y_2+\cdots\\
&=1+h_1(x)h_1(y)+h_2(x)m_2(y)+h_{11}(x)m_{11}(y)+\cdots ,
\end{align*}
where $h_1(y)=y_1+y_2$, $m_2(y)=y_1^2+y_2^2$, $m_{11}(y)=y_1y_2$, and $h_{11}(x)=h_1(x)^2$.
The two degree-$2$ coefficient extractions are therefore
\begin{align*}
[y_1y_2]\,\prod_{i=1}^{2}\prod_{j=1}^{2}(1-x_i y_j)^{-1}
&=h_1(x)^2=h_{11}(x),\\
[y_1^2]\,\prod_{i=1}^{2}\prod_{j=1}^{2}(1-x_i y_j)^{-1}
&=h_2(x).
\end{align*}
So the exponent pattern $(1,1)$ in the $y$ variables selects $h_{11}(x)$, while the exponent pattern $(2)$ selects $h_2(x)$.
[/example]
The example suggests that the kernel is not just a product: it is the universal pairing between two natural bases. Small expansions are unreliable as a method because they hide how repeated exponents are grouped into monomial symmetric functions and give no uniform rule for larger partitions. To make coefficient extraction systematic rather than dependent on inspection, we need a degree-by-degree identity for all partitions.
[quotetheorem:5200]
[citeproof:5200]
This theorem is the complete-monomial form of the Cauchy identity, matching the duality proved in Chapter 3: it says that $\{h_\lambda\}$ and $\{m_\lambda\}$ are dual bases for the Hall inner product. The completion is necessary because the displayed sum ranges over partitions of all sizes; without working degree by degree, the expression is not an element of the ordinary tensor product. The homogeneous finite-degree interpretation is also what makes the proof legitimate: in degree $N$ only partitions of $N$ contribute, and finite-variable truncations stabilize. The theorem does not by itself give a symmetric expansion in the same basis on both sides, so the next section asks which bases make the kernel diagonal.
## Dual Bases And The Schur Cauchy Identity
If the kernel records dual bases, what happens when the basis is self-dual? The complete and monomial bases are useful for extraction, but they treat the two alphabets asymmetrically and do not expose the tableau structure developed earlier in the course. Schur functions have exactly the missing role for the Hall inner product, and the kernel therefore becomes a diagonal sum in the Schur basis.
[remark: Hall Pairing In Complete-Monomial Coordinates]
Recall the Hall inner product on $\operatorname{Sym}$ over $\mathbb Q$ from Chapter 3. By complete-monomial duality, it is equivalently characterized on the complete and monomial bases by
\begin{align*}
(h_\lambda,m_\mu)=\delta_{\lambda\mu}
\end{align*}
for all partitions $\lambda,\mu$.
[/remark]
The preceding kernel expansion is the generating series for this pairing. A naive substitution of a new basis into $\sum_\lambda h_\lambda(x)m_\lambda(y)$ would generally introduce off-diagonal terms, so the tensor is useful only if it is independent of the particular dual basis used to write it. To use a basis better adapted to tableaux, we need the change-of-basis principle behind the kernel.
[quotetheorem:5201]
[citeproof:5201]
This abstract form is useful because the Schur functions were already constructed through tableaux in Chapter 4 and will now give the diagonal form of the same kernel. The homogeneous, dual-basis hypotheses also explain the limits of how the identity should be used: the kernel is read degree by degree, and a diagonal expansion is available only for a basis paired with its Hall-dual basis. The Schur version is the course's central identity.
[quotetheorem:5202]
[citeproof:5202]
[remark: Schur Self-Duality From The Cauchy Kernel]
The Hall pairing is already fixed by the complete-monomial duality in Chapter 3. The Schur Cauchy identity writes the same reproducing kernel in the diagonal form
\begin{align*}
\Omega(x,y)=\sum_\lambda s_\lambda(x)s_\lambda(y).
\end{align*}
Since coefficients in the reproducing kernel record the basis dual to the left-hand basis, this diagonal expansion says that the Schur basis is self-dual for the Hall pairing:
\begin{align*}
\langle s_\lambda,s_\mu\rangle=\delta_{\lambda\mu}.
\end{align*}
This is the form of Schur orthogonality used in the later Pieri, skewing, and Littlewood-Richardson chapters.
[/remark]
The identity is powerful because the product side is suited to elementary enumeration, while the Schur side packages the same data by partitions and tableaux. Its infinite product and infinite sum are formal completed expressions, so coefficient extraction must be read degreewise, or after restricting to finitely many variables and then passing to the stable limit. The identity does not assert analytic convergence of the product, nor does it count objects until a specific coefficient is chosen. Coefficient extraction is the bridge between these two readings.
[example: Matrices With Fixed Margins]
Fix non-negative integer vectors $r=(r_1,\dots,r_a)$ and $c=(c_1,\dots,c_b)$ with $\sum_i r_i=\sum_j c_j=N$. We show that the coefficient of $x_1^{r_1}\cdots x_a^{r_a}y_1^{c_1}\cdots y_b^{c_b}$ in
\begin{align*}
\prod_{i=1}^{a}\prod_{j=1}^{b}(1-x_i y_j)^{-1}
\end{align*}
counts $a\times b$ matrices $A=(A_{ij})$ with entries in $\mathbb N\cup\{0\}$, row sums $r_i$, and column sums $c_j$.
For each pair $(i,j)$, the geometric expansion is
\begin{align*}
(1-x_i y_j)^{-1}
&=\sum_{n\ge 0}(x_i y_j)^n \\
&=\sum_{n\ge 0}x_i^n y_j^n .
\end{align*}
Multiplying these expansions over all pairs $(i,j)$ gives
\begin{align*}
\prod_{i=1}^{a}\prod_{j=1}^{b}(1-x_i y_j)^{-1}
&=\prod_{i=1}^{a}\prod_{j=1}^{b}\left(\sum_{A_{ij}\ge 0}x_i^{A_{ij}}y_j^{A_{ij}}\right) \\
&=\sum_{A_{11}\ge 0}\cdots\sum_{A_{ab}\ge 0}
\prod_{i=1}^{a}\prod_{j=1}^{b}x_i^{A_{ij}}y_j^{A_{ij}} \\
&=\sum_{A\in(\mathbb N\cup\{0\})^{a\times b}}
x_1^{A_{11}+\cdots+A_{1b}}\cdots x_a^{A_{a1}+\cdots+A_{ab}}
y_1^{A_{11}+\cdots+A_{a1}}\cdots y_b^{A_{1b}+\cdots+A_{ab}} .
\end{align*}
Thus a matrix $A$ contributes to the coefficient of $x^r y^c$ exactly when
\begin{align*}
A_{i1}+\cdots+A_{ib}&=r_i \quad \text{for each } i=1,\dots,a,\\
A_{1j}+\cdots+A_{aj}&=c_j \quad \text{for each } j=1,\dots,b.
\end{align*}
These are precisely the row-sum and column-sum conditions, so the desired coefficient is the number of non-negative integer matrices with margins $r$ and $c$.
[/example]
The matrix example gives the product-side coefficient interpretation. Direct enumeration of matrices with prescribed margins has no canonical decomposition by partition shape, so it does not by itself explain why tableaux should appear. To connect this count with tableaux, we now extract the same coefficient from the Schur-side expansion and express the answer through Kostka numbers.
[quotetheorem:5203]
[citeproof:5203]
The theorem converts a direct counting problem into a Schur-positive expansion. The equal-total hypothesis is necessary because a matrix has the same total sum whether computed from rows or columns; if $|r|\ne |c|$, the relevant coefficient is zero. The finite alphabet restriction is also part of the statement: $r$ and $c$ specify only finitely many rows and columns, while the stable symmetric-function identity supplies the expansion used to compute the coefficient. The formula gives an aggregate count through Kostka numbers; it does not construct the matrices individually or distinguish finer statistics without adding extra variables. It is a prototype for many later arguments: encode objects by a product, expand the product in Schur functions, and read structure constants or tableau numbers from coefficients.
## The Omega Involution And The Dual Cauchy Identity
The Cauchy kernel used complete homogeneous functions because the factors were geometric series. What changes if each pair of variables is allowed at most once rather than with arbitrary multiplicity? Replacing geometric factors by two-term factors cannot be obtained by a direct coefficient relabelling of the ordinary kernel, because complete symmetric functions must be exchanged for elementary symmetric functions in a way compatible with multiplication. The product becomes $\prod_{i,j}(1+x_i y_j)$, and the algebraic operation behind the change is the omega involution.
[definition: Omega Involution]
The omega involution is the graded algebra automorphism $\omega:\operatorname{Sym}\to\operatorname{Sym}$ determined by
\begin{align*}
\omega(h_n)=e_n
\end{align*}
for all $n\ge 1$.
[/definition]
Applying $\omega$ exchanges complete and elementary symmetric functions. To use it on the Schur Cauchy identity, we need its effect on Schur functions, and that effect is governed by conjugation of Young diagrams.
[quotetheorem:5204]
[citeproof:5204]
The theorem explains why omega is a shape-conjugating operation rather than only a generator-level substitution. The determinant formulas are needed here: knowing the images of the $h_n$ determines the algebra map, but the effect on Schur functions depends on the compatibility between Jacobi-Trudi and dual Jacobi-Trudi. The statement is specific to Schur functions and does not say that every symmetric-function basis is permuted by conjugating partitions. Now we apply $\omega$ in one alphabet to the ordinary Cauchy identity; on the product side, the transformation sends each geometric generating function for $h_n$ to the finite generating function for $e_n$.
[remark: Dual Cauchy Identity As An Omega Consequence]
Applying $\omega$ in the $y$-alphabet to the Schur Cauchy identity and using $\omega(s_\lambda)=s_{\lambda'}$ gives
\begin{align*}
\prod_{i,j \geq 1}(1+x_i y_j)=\sum_{\lambda} s_\lambda(x)s_{\lambda'}(y).
\end{align*}
This is the dual Cauchy identity.
[/remark]
The dual identity is the square-free analogue of the ordinary Cauchy identity. Applying $\omega$ to only one alphabet is essential: applying it to both alphabets would conjugate both Schur factors and would not change the product in the same way. As with the ordinary Cauchy identity, the infinite expression is formal and should be used degreewise or in finitely many variables for coefficient extraction. Its product side counts zero-one matrices, while its Schur side records conjugate shapes in the second alphabet.
[example: Zero-One Matrices With Fixed Margins]
Fix row sums $r=(r_1,\dots,r_a)$ and column sums $c=(c_1,\dots,c_b)$ with common total $N$, and write $x^r y^c=x_1^{r_1}\cdots x_a^{r_a}y_1^{c_1}\cdots y_b^{c_b}$. For each pair $(i,j)$,
\begin{align*}
1+x_i y_j=\sum_{\varepsilon\in\{0,1\}}(x_i y_j)^\varepsilon
=\sum_{\varepsilon\in\{0,1\}}x_i^\varepsilon y_j^\varepsilon .
\end{align*}
Multiplying these two-term choices over all pairs gives
\begin{align*}
\prod_{i=1}^{a}\prod_{j=1}^{b}(1+x_i y_j)
&=\prod_{i=1}^{a}\prod_{j=1}^{b}
\left(\sum_{A_{ij}\in\{0,1\}}x_i^{A_{ij}}y_j^{A_{ij}}\right)\\
&=\sum_{A_{11}\in\{0,1\}}\cdots\sum_{A_{ab}\in\{0,1\}}
\prod_{i=1}^{a}\prod_{j=1}^{b}x_i^{A_{ij}}y_j^{A_{ij}}\\
&=\sum_{A\in\{0,1\}^{a\times b}}
x_1^{A_{11}+\cdots+A_{1b}}\cdots x_a^{A_{a1}+\cdots+A_{ab}}
y_1^{A_{11}+\cdots+A_{a1}}\cdots y_b^{A_{1b}+\cdots+A_{ab}} .
\end{align*}
Therefore a zero-one matrix $A$ contributes to the coefficient of $x^r y^c$ exactly when
\begin{align*}
A_{i1}+\cdots+A_{ib}&=r_i \quad \text{for } i=1,\dots,a,\\
A_{1j}+\cdots+A_{aj}&=c_j \quad \text{for } j=1,\dots,b.
\end{align*}
These are precisely the prescribed row-sum and column-sum conditions, so the coefficient counts the desired zero-one matrices.
Using the *Dual Cauchy Identity*,
\begin{align*}
\prod_{i=1}^{a}\prod_{j=1}^{b}(1+x_i y_j)
=\sum_{\lambda}s_\lambda(x)s_{\lambda'}(y).
\end{align*}
Extracting the coefficient of $x^r y^c$ degree by degree gives
\begin{align*}
[x^r y^c]\prod_{i=1}^{a}\prod_{j=1}^{b}(1+x_i y_j)
&=[x^r y^c]\sum_{\lambda}s_\lambda(x)s_{\lambda'}(y)\\
&=\sum_{\lambda\vdash N}[x^r]s_\lambda(x)\,[y^c]s_{\lambda'}(y)\\
&=\sum_{\lambda\vdash N}K_{\lambda r}K_{\lambda' c}.
\end{align*}
Thus the same coefficient is counted directly by zero-one matrices and, after Schur expansion, by Kostka numbers with the second shape conjugated.
[/example]
The ordinary and dual Cauchy identities are the main kernel identities of the course. They turn the Hall inner product, Schur self-duality, tableau enumeration, and partition conjugation into concrete product formulas, and they prepare the ground for using Schur functions as structure constants in the Littlewood-Richardson rule.
The Cauchy kernel and Schur self-duality now make it possible to study multiplication by the simplest symmetric functions in a precise combinatorial form. Chapter 7 focuses on the Pieri rules and skew Schur functions, which isolate the first nontrivial structure constants of the Schur basis.
# 7. Pieri Rules and Skew Schur Functions
This chapter explains how Schur functions behave when multiplied by the simplest symmetric functions, the complete homogeneous functions $h_r$ and the elementary functions $e_r$. It assumes the earlier construction of Schur functions from semistandard tableaux, the Jacobi-Trudi identity, conjugate partitions, and the standard bases $h_r$ and $e_r$ of the symmetric-function ring $\operatorname{Sym}$. The answer is given by the Pieri rules: multiplying by $h_r$ adds boxes with no two in the same column, while multiplying by $e_r$ adds boxes with no two in the same row. We then extend Schur functions from ordinary Young diagrams to skew diagrams, where the Jacobi-Trudi determinant remains the main algebraic tool and prepares the ground for Littlewood-Richardson coefficients.
## Horizontal and Vertical Strips
The first question is how to describe the shapes that can be obtained by adding a fixed number of boxes to a Young diagram while preserving the partition condition. A naive instruction such as “add $r$ boxes” is too coarse: adding two boxes to $(3,1)$ could produce a new row, extend existing rows, or stack boxes in one column, and these possibilities behave differently under multiplication. The Pieri rules separate two extremal ways of adding boxes: spreading them across columns, or spreading them across rows.
[definition: Skew Diagram]
Let $\lambda$ and $\mu$ be partitions with $\mu_i \le \lambda_i$ for all $i$. The skew diagram $\lambda / \mu$ is the set-theoretic difference of Young diagrams
\begin{align*}
\lambda / \mu = \{(i,j) : 1 \le j \le \lambda_i\} \setminus \{(i,j) : 1 \le j \le \mu_i\}.
\end{align*}
[/definition]
The size of the skew diagram is $|\lambda / \mu| = |\lambda| - |\mu|$. This numerical datum alone does not determine the relevant multiplication behaviour: for instance, the two added boxes in $(5,1)/(3,1)$ lie in one row, whereas the two added boxes in $(3,1,1,1)/(3,1)$ lie in one column.
A skew diagram records the boxes added when passing from $\mu$ to $\lambda$. The first obstruction is column collision: if two newly added boxes lie in the same column, then they cannot both come from multiplying by a one-row Schur function, because the tableau contribution of a row shape has no vertical room. This leads to the row-like notion used by multiplication by $h_r$.
[definition: Horizontal Strip]
Let $\lambda$ and $\mu$ be partitions with $\mu \,\subset\, \lambda$. The skew diagram $\lambda / \mu$ is a horizontal strip of size $r$ if $|\lambda / \mu| = r$ and no two boxes of $\lambda / \mu$ lie in the same column.
[/definition]
The phrase "horizontal" refers to the fact that the new boxes may be distributed along rows but cannot stack in a column. Since multiplication by $e_r$ will be controlled by the conjugate picture, we also need the corresponding condition that prevents two added boxes from occupying the same row.
[definition: Vertical Strip]
Let $\lambda$ and $\mu$ be partitions with $\mu \,\subset\, \lambda$. The skew diagram $\lambda / \mu$ is a vertical strip of size $r$ if $|\lambda / \mu| = r$ and no two boxes of $\lambda / \mu$ lie in the same row.
[/definition]
Vertical strips are the conjugate version of horizontal strips. Indeed, $\lambda / \mu$ is vertical iff $\lambda' / \mu'$ is horizontal, because conjugating Young diagrams interchanges rows and columns.
[example: Two-Box Strips For A Small Partition]
Let $\mu=(3,1)$. Its Young diagram has boxes
\begin{align*}
\{(1,1),(1,2),(1,3),(2,1)\}.
\end{align*}
Adding two boxes means choosing a partition $\lambda\supset \mu$ with $|\lambda|=|\mu|+2=6$. The possible partitions of $6$ containing $(3,1)$ are
\begin{align*}
(5,1),\quad (4,2),\quad (4,1,1),\quad (3,3),\quad (3,2,1),\quad (3,1,1,1).
\end{align*}
For these six choices, the added boxes $\lambda/\mu$ are respectively
\begin{align*}
(5,1)/(3,1)&=\{(1,4),(1,5)\},\\
(4,2)/(3,1)&=\{(1,4),(2,2)\},\\
(4,1,1)/(3,1)&=\{(1,4),(3,1)\},\\
(3,3)/(3,1)&=\{(2,2),(2,3)\},\\
(3,2,1)/(3,1)&=\{(2,2),(3,1)\},\\
(3,1,1,1)/(3,1)&=\{(3,1),(4,1)\}.
\end{align*}
A horizontal strip has no two added boxes in the same column. The column sets above are
\begin{align*}
\{4,5\},\quad \{4,2\},\quad \{4,1\},\quad \{2,3\},\quad \{2,1\},\quad \{1,1\},
\end{align*}
so the horizontal strips of size $2$ give
\begin{align*}
(5,1),\quad (4,2),\quad (4,1,1),\quad (3,3),\quad (3,2,1),
\end{align*}
and exclude $(3,1,1,1)$ because both added boxes lie in column $1$.
A vertical strip has no two added boxes in the same row. The row sets for the same six skew diagrams are
\begin{align*}
\{1,1\},\quad \{1,2\},\quad \{1,3\},\quad \{2,2\},\quad \{2,3\},\quad \{3,4\},
\end{align*}
so the vertical strips of size $2$ give
\begin{align*}
(4,2),\quad (4,1,1),\quad (3,2,1),\quad (3,1,1,1),
\end{align*}
and exclude $(5,1)$ and $(3,3)$ because their two added boxes lie in one row. Thus the two strip conditions overlap in $(4,2)$, $(4,1,1)$, and $(3,2,1)$, but they do not coincide.
[/example]
This combinatorial distinction is designed to match the two simplest Schur functions. Since $h_r = s_{(r)}$ is indexed by one row and $e_r = s_{(1^r)}$ is indexed by one column, multiplying by $h_r$ should add a row-like object, while multiplying by $e_r$ should add a column-like object.
[illustration:horizontal-vertical-strips-mu-31]
## Pieri Rules for Complete and Elementary Multiplication
The next problem is to turn the strip conditions into identities in the ring of symmetric functions. Degree alone gives only a weak constraint: $h_2s_{31}$ must be a combination of Schur functions of degree $6$, but this does not explain why $s_{3111}$ is excluded, nor why the allowed coefficients should be $1$. The Pieri rules refine this naive degree count by saying exactly which shapes survive and why no multiplicities appear.
[quotetheorem:5205]
[citeproof:5205]
The hypotheses also describe the boundary cases. When $r=0$, the only horizontal strip of size $0$ is the empty skew diagram $\mu/\mu$, so the formula says $h_0s_\mu=s_\mu$. For $r>0$, the containment $\mu\subset\lambda$ ensures that the skew diagram represents boxes added to $\mu$ rather than a formal difference of unrelated diagrams.
The no-two-in-one-column hypothesis is not cosmetic, and it marks the necessity of the theorem's shape condition. For example, if one tried to include $\lambda=(3,1,1,1)$ in $h_2s_{31}$, the two new boxes below the first column would have to be accounted for by a one-row factor, but the tableau bijection would force two added entries into the same column and violate the column-strict mechanism behind the rule. The theorem also has a definite limitation: it does not say how to multiply by an arbitrary Schur function, how to compute coefficients larger than $1$, or how to decide the full product $s_\mu s_\nu$ when $\nu$ has more than one row. Its role is therefore both local and preparatory: it gives a multiplicity-free rule for a single row factor, and repeated use of that rule becomes the first mechanism for expanding determinants and, later, for approaching the Littlewood-Richardson rule.
[example: Complete Pieri Expansion For A Small Partition]
For $\mu=(3,1)$ and $r=2$, *Horizontal Pieri Rule* says that $h_2s_{31}$ is the sum of $s_\lambda$ over all partitions $\lambda\supset(3,1)$ for which $\lambda/(3,1)$ has two boxes and no two of those boxes lie in the same column. The six partitions of size $6$ containing $(3,1)$ are
\begin{align*}
(5,1),\quad (4,2),\quad (4,1,1),\quad (3,3),\quad (3,2,1),\quad (3,1,1,1).
\end{align*}
Their added boxes are
\begin{align*}
(5,1)/(3,1)&=\{(1,4),(1,5)\},\\
(4,2)/(3,1)&=\{(1,4),(2,2)\},\\
(4,1,1)/(3,1)&=\{(1,4),(3,1)\},\\
(3,3)/(3,1)&=\{(2,2),(2,3)\},\\
(3,2,1)/(3,1)&=\{(2,2),(3,1)\},\\
(3,1,1,1)/(3,1)&=\{(3,1),(4,1)\}.
\end{align*}
The corresponding column sets are
\begin{align*}
\{4,5\},\quad \{4,2\},\quad \{4,1\},\quad \{2,3\},\quad \{2,1\},\quad \{1,1\}.
\end{align*}
Thus the first five skew diagrams are horizontal strips of size $2$, while $(3,1,1,1)/(3,1)$ is not, because both added boxes have column coordinate $1$. Therefore
\begin{align*}
h_2 s_{31}
&= s_{51}+s_{42}+s_{411}+s_{33}+s_{321}.
\end{align*}
This is the concrete instance of the rule: multiplying by $h_2$ keeps exactly the two-box additions with at most one new box in each column.
[/example]
The elementary case should now be obtained without repeating the tableau argument from scratch. The involution $\omega:\operatorname{Sym}\to\operatorname{Sym}$, introduced in Chapter 3 and applied to Schur functions in Chapter 6, is the ring automorphism of the symmetric-function ring that sends $h_r$ to $e_r$ and $s_\lambda$ to $s_{\lambda'}$. Applying $\omega$ to the horizontal rule therefore asks for the transposed strip condition.
[quotetheorem:5206]
[citeproof:5206]
Again the case $r=0$ gives the empty strip and the identity $e_0s_\mu=s_\mu$. The partition hypotheses are needed for the same reason as before: the expression $\lambda/\mu$ must be an actual skew diagram, and conjugation must carry it to the horizontal strip $\lambda'/\mu'$ used in the proof.
The no-two-in-one-row condition is the exact transpose of the obstruction in the horizontal rule, so it is necessary rather than a decorative analogue. For instance, $(5,1)/(3,1)$ adds two boxes in the first row, so it cannot occur in $e_2s_{31}$: a one-column factor has no horizontal room for two added boxes in the same row. The theorem is correspondingly limited to multiplication by a single column factor; it does not by itself handle products with a general shape, nor does it produce the multiplicities that appear once both rows and columns of the second factor interact. This limitation is productive: the vertical rule complements the horizontal rule, and the pair supplies the recursive tools needed to convert Jacobi-Trudi determinants for skew shapes into Schur expansions.
[example: Elementary Pieri Expansion For A Small Partition]
For $\mu=(3,1)$ and $r=2$, *Vertical Pieri Rule* says that $e_2s_{31}$ is the sum of $s_\lambda$ over all partitions $\lambda\supset(3,1)$ such that $\lambda/(3,1)$ has two boxes and no two of those boxes lie in the same row. Since $|\mu|=3+1=4$, the outer partitions have size $4+2=6$. The partitions of $6$ containing $(3,1)$ are
\begin{align*}
(5,1),\quad (4,2),\quad (4,1,1),\quad (3,3),\quad (3,2,1),\quad (3,1,1,1).
\end{align*}
Their added boxes are
\begin{align*}
(5,1)/(3,1)&=\{(1,4),(1,5)\},\\
(4,2)/(3,1)&=\{(1,4),(2,2)\},\\
(4,1,1)/(3,1)&=\{(1,4),(3,1)\},\\
(3,3)/(3,1)&=\{(2,2),(2,3)\},\\
(3,2,1)/(3,1)&=\{(2,2),(3,1)\},\\
(3,1,1,1)/(3,1)&=\{(3,1),(4,1)\}.
\end{align*}
The corresponding row sets are
\begin{align*}
\{1,1\},\quad \{1,2\},\quad \{1,3\},\quad \{2,2\},\quad \{2,3\},\quad \{3,4\}.
\end{align*}
Thus the vertical strips are precisely those with row sets $\{1,2\}$, $\{1,3\}$, $\{2,3\}$, and $\{3,4\}$, giving
\begin{align*}
e_2 s_{31}
&= s_{42}+s_{411}+s_{321}+s_{3111}.
\end{align*}
The partitions $(5,1)$ and $(3,3)$ are excluded because their two added boxes have row coordinates $\{1,1\}$ and $\{2,2\}$ respectively, so each violates the no-two-in-one-row condition for a vertical strip.
[/example]
Together, the two Pieri rules give recursive control over many Schur expansions. Since every $h_\alpha = h_{\alpha_1}\cdots h_{\alpha_k}$ can be multiplied one factor at a time, Pieri multiplication builds larger Schur-positive expressions from repeated strip additions.
This recursive viewpoint is also the combinatorial shadow of several representation-theoretic branching laws. In the representation theory of symmetric groups and general linear groups, adding a horizontal or vertical strip describes how irreducible objects change when tensored with a symmetric or exterior power. The Pieri rules therefore connect the arithmetic of symmetric functions with concrete branching behaviour in polynomial representations, not merely with later Schur-product calculations.
## Skew Schur Functions and Skew Jacobi-Trudi Formulae
The final problem of the chapter is to define the symmetric function attached to a skew shape itself, not only to the process of adding that shape. Skew Schur functions package the difference between two partitions into a single object and make the coefficients in Schur products visible.
[definition: Semistandard Skew Tableau]
Let $\lambda$ and $\mu$ be partitions with $\mu \,\subset\, \lambda$. A semistandard skew tableau of shape $\lambda/\mu$ is a filling of the boxes of $\lambda/\mu$ by positive integers such that entries weakly increase from left to right along rows and strictly increase from top to bottom down columns.
[/definition]
This is the same semistandard rule used for ordinary Schur functions, but applied only to the boxes that remain after removing $\mu$. The missing inner shape affects which entries can sit next to each other, so skew shapes carry more structure than their multiset of row lengths.
[definition: Skew Schur Function]
Let $\lambda$ and $\mu$ be partitions with $\mu \,\subset\, \lambda$. The skew Schur construction assigns to the skew shape $\lambda/\mu$ the element $s_{\lambda/\mu}$ of the ring of symmetric functions $\operatorname{Sym}$ defined by
\begin{align*}
s_{\lambda/\mu} = \sum_T x^T,
\end{align*}
where the sum is over all semistandard skew tableaux $T$ of shape $\lambda/\mu$, and $x^T = \prod_i x_i^{m_i(T)}$ with $m_i(T)$ the number of entries equal to $i$.
[/definition]
When $\mu=\varnothing$, this is the ordinary Schur function $s_\lambda$. The symmetry of $s_{\lambda/\mu}$ is less immediate from the definition, but it follows from the determinant formula below and is one reason the Jacobi-Trudi identity is so useful.
[quotetheorem:5207]
[citeproof:5207]
Each hypothesis in the determinant is doing work. The containment condition $\mu\,\subset\,\lambda$ is needed because otherwise removing the boxes of $\mu$ from $\lambda$ does not leave a Young-diagram difference that can be filled by tableaux. For example, if $\lambda=(1)$ and $\mu=(2)$, the proposed inner diagram already extends past the outer diagram in the first row, so there is no collection of remaining boxes whose rows and columns could carry the semistandard conditions. The determinant may still contain formal entries such as $h_{-1}=0$, but those entries no longer enumerate skew tableaux. The condition $\ell\ge \ell(\lambda)$ ensures that every nonzero row of the outer shape appears in the determinant: taking only $\ell=1$ for $\lambda=(1,1)$ and $\mu=\varnothing$ would give $h_1$ instead of $s_{11}$. Finally, the conventions $h_0=1$ and $h_m=0$ for $m<0$ encode empty path segments and impossible path displacements; without them the determinant would not specialise correctly in small or triangular cases. This determinant also sets up the Littlewood-Richardson problem: once $s_{\lambda/\mu}$ is expanded in the ordinary Schur basis, its coefficients become the structure constants for Schur products.
[illustration:skew-shape-lambda-mu]
There is also a dual determinant in elementary symmetric functions obtained by conjugating the skew shape. It is often called the skew dual Jacobi-Trudi identity and reads
\begin{align*}
s_{\lambda/\mu} = \det\left(e_{\lambda'_i - \mu'_j - i + j}\right)_{1 \le i,j \le \lambda_1}.
\end{align*}
Both formulae reduce skew Schur functions to the classical $h$ and $e$ bases. Their limitation is that they give a determinantal expression rather than direct Schur-basis coefficients; extracting those coefficients still requires Pieri iteration or, in full generality, the Littlewood-Richardson rule.
[example: Skew Jacobi-Trudi Computation]
Let $\lambda=(4,2,1)$ and $\mu=(2,1)$. Extend $\mu$ by a trailing zero, so $\mu_1=2$, $\mu_2=1$, and $\mu_3=0$. Taking $\ell=3$, the entries in the skew Jacobi-Trudi matrix are
\begin{align*}
\lambda_i-\mu_j-i+j
&=
\begin{pmatrix}
4-2-1+1 & 4-1-1+2 & 4-0-1+3\\
2-2-2+1 & 2-1-2+2 & 2-0-2+3\\
1-2-3+1 & 1-1-3+2 & 1-0-3+3
\end{pmatrix}\\
&=
\begin{pmatrix}
2 & 4 & 6\\
-1 & 1 & 3\\
-3 & -1 & 1
\end{pmatrix}.
\end{align*}
By the convention $h_m=0$ for $m<0$, *Skew Jacobi-Trudi Identity* gives
\begin{align*}
s_{421/21}
&=
\det
\begin{pmatrix}
h_2 & h_4 & h_6\\
h_{-1} & h_1 & h_3\\
h_{-3} & h_{-1} & h_1
\end{pmatrix}\\
&=
\det
\begin{pmatrix}
h_2 & h_4 & h_6\\
0 & h_1 & h_3\\
0 & 0 & h_1
\end{pmatrix}\\
&=
h_2(h_1h_1-h_3\cdot 0)-h_4(0\cdot h_1-h_3\cdot 0)+h_6(0\cdot 0-h_1\cdot 0)\\
&=h_2h_1^2.
\end{align*}
Now use $h_2=s_2$ and $h_1=s_1$. First, *Horizontal Pieri Rule* with $r=1$ and $\mu=(2)$ adds one box to $(2)$, giving the two possible outer partitions $(3)$ and $(2,1)$:
\begin{align*}
h_2h_1=s_2h_1=s_3+s_{21}.
\end{align*}
Multiplying once more by $h_1$ and applying the same rule to each summand gives
\begin{align*}
h_2h_1^2
&=(s_3+s_{21})h_1\\
&=s_3h_1+s_{21}h_1\\
&=(s_4+s_{31})+(s_{31}+s_{22}+s_{211})\\
&=s_4+2s_{31}+s_{22}+s_{211}.
\end{align*}
Therefore
\begin{align*}
s_{421/21}=s_4+2s_{31}+s_{22}+s_{211}.
\end{align*}
This example shows the full passage from a skew Jacobi-Trudi determinant to a Schur-basis expansion: first reduce the determinant to a product of complete homogeneous functions, then expand that product by successive horizontal strip additions.
[/example]
Skew Schur functions are the bridge from the Pieri rules of this chapter to the Littlewood-Richardson rule of Chapter 8. The coefficient of $s_\lambda$ in $s_\mu s_\nu$ can be interpreted through skew Schur functions, and the next chapter turns that interpretation into a tableau rule for all Schur products, not only multiplication by a single row or column.
After the Pieri rules, the natural next problem is to understand arbitrary Schur products rather than only multiplication by a row or column. Chapter 8 resolves that by developing the Littlewood-Richardson rule, the full tableau calculus for Schur structure constants.
# 8. Littlewood-Richardson Rule
This chapter explains how Schur functions multiply. Chapters 4 and 5 constructed Schur functions from tableaux and determinants, so we already know from the Schur basis theorem that the Schur functions form a basis of the symmetric functions. The new question is structural: when two Schur functions are multiplied, which Schur functions occur, and with what coefficients? The answer is the Littlewood-Richardson rule, which turns the structure constants of the Schur basis into a counting problem for skew tableaux.
## Schur Products and Structure Constants
The product $s_\lambda s_\mu$ is homogeneous of degree $|\lambda|+|\mu|$, so its Schur expansion can only involve partitions of that size. Since the Schur functions form a basis of $\operatorname{Sym}$, the coefficients in this expansion exist and are unique; the problem is to describe them without multiplying determinants. A naive expansion through monomial or complete symmetric functions gives alternating sums after changing basis, so it gives no direct reason for the final Schur coefficients to be non-negative.
[definition: Littlewood-Richardson Coefficient]
For partitions $\lambda$, $\mu$, and $\nu$, the Littlewood-Richardson coefficient
\begin{align*}
c^\nu_{\lambda,\mu}\in\mathbb Z
\end{align*}
is the integer defined by
\begin{align*}
s_\lambda s_\mu = \sum_{\rho} c^\rho_{\lambda,\mu}s_\rho,
\end{align*}
where the sum is over partitions $\rho$ with $|\rho|=|\lambda|+|\mu|$.
[/definition]
The coefficient records the multiplicative structure of the Schur basis. Positivity is not immediate from this definition, because an algebraic expansion could involve cancellations. The point of the Littlewood-Richardson rule is that $c^\nu_{\lambda,\mu}$ is the cardinality of an explicit finite set.
[remark: Degree And Containment Constraints]
If $c^\nu_{\lambda,\mu}\ne 0$, then $|\nu|=|\lambda|+|\mu|$ and $\lambda\subset \nu$ after drawing Young diagrams in the standard top-left aligned convention. The containment condition appears because the tableaux counted by the rule live on the skew diagram $\nu/\lambda$.
[/remark]
This reduces product computations to a finite search through possible outer shapes $\nu$. For small products it is often faster to draw the admissible skew shapes and fill them than to manipulate symmetric polynomials directly.
[example: Product Of $s_{21}$ And $s_2$]
The product $s_{21}s_2$ has degree $|21|+|2|=3+2=5$, so only partitions $\nu$ of $5$ can occur. The partitions of $5$ containing $(2,1)$ are
\begin{align*}
(4,1),\quad (3,2),\quad (3,1,1),\quad (2,2,1),\quad (2,1,1,1).
\end{align*}
Since $s_2$ corresponds to content $(2)$, every skew filling has two entries equal to $1$. Such a filling is semistandard exactly when the two added boxes are not in the same column: rows are automatically weakly increasing, and a column containing two added boxes would contain $1$ above $1$, violating column strictness. The reading word is then $11$, whose prefixes are $1$ and $11$, so the lattice condition holds.
Thus the allowed additions are
\begin{align*}
(2,1)\subset (4,1) &: \text{new boxes in columns }3,4,\\
(2,1)\subset (3,2) &: \text{new boxes in columns }3,2,\\
(2,1)\subset (3,1,1) &: \text{new boxes in columns }3,1,\\
(2,1)\subset (2,2,1) &: \text{new boxes in columns }2,1.
\end{align*}
The partition $(2,1,1,1)$ is excluded because the two new boxes would both lie in column $1$. By *Pieri's rule*, equivalently by counting these Littlewood-Richardson tableaux, each allowed shape contributes once, so
\begin{align*}
s_{21}s_2=s_{41}+s_{32}+s_{311}+s_{221}.
\end{align*}
This computation shows that multiplying by $s_2$ adds a horizontal $2$-strip to the original Young diagram.
[/example]
## Reading Words and Lattice Conditions
A skew semistandard tableau contains more data than its shape and content. To decide whether it contributes to a Schur product, we need an order in which to read its entries and a condition saying that the entries are balanced at every stage of that reading. Reading in the wrong order can accept the wrong fillings: for example, a vertical skew column filled with $1$ above $2$ has column-strict entries, but the top-to-bottom word $12$ behaves differently from the bottom-to-top word $21$ under the lattice test.
[definition: Reading Word]
Let $\lambda\subset\nu$ be partitions. The reading word is the map
\begin{align*}
w: \operatorname{SSYT}(\nu/\lambda) \to \mathbb{Z}_{>0}^{*}
\end{align*}
which sends a skew semistandard tableau $T$ to the word $w(T)$ obtained by reading each row from right to left, starting with the top row and moving downward.
[/definition]
The right-to-left convention is designed to match the combinatorics of row insertion and jeu de taquin. After fixing the word, we need to recognise exactly those words that can rectify to the highest-weight tableau of a prescribed content.
[definition: Lattice Word]
A word $w=w_1\cdots w_N$ in positive integers is a lattice word if, for every prefix $w_1\cdots w_k$ and every $i\ge 1$, the number of letters equal to $i$ in that prefix is at least the number of letters equal to $i+1$ in that prefix.
[/definition]
Thus a lattice word never has more letters equal to $2$ than letters equal to $1$ at any point, never has more letters equal to $3$ than letters equal to $2$ at any point, and so on. We now combine this word condition with the semistandard row and column conditions to isolate the tableaux counted by Schur multiplication.
[definition: Littlewood-Richardson Tableau]
Let $\lambda\subset \nu$ be partitions, and let $\mu$ be a partition with $|\mu|=|\nu|-|\lambda|$. A Littlewood-Richardson tableau of shape $\nu/\lambda$ and content $\mu$ is a semistandard Young tableau of skew shape $\nu/\lambda$ whose reading word is a lattice word and whose entry $i$ occurs $\mu_i$ times.
[/definition]
The definition combines three constraints: row weak increase, column strict increase, and the lattice condition. The first two constraints make the filling a semistandard tableau; the third selects the tableaux compatible with Schur multiplication.
[example: Product Of $s_{22}$ And $s_{11}$]
We compute $s_{22}s_{11}$ using the vertical form of *Pieri's rule*: multiplying by $s_{11}=e_2$ means adding two boxes to the diagram of $(2,2)$ with no two added boxes in the same row. Since
\begin{align*}
|(2,2)|+|(1,1)|=4+2=6,
\end{align*}
we only consider partitions of $6$ containing $(2,2)$. They are
\begin{align*}
(4,2),\quad (3,3),\quad (3,2,1),\quad (2,2,2),\quad (2,2,1,1).
\end{align*}
The skew shapes are tested by the positions of the two added boxes:
\begin{align*}
(4,2)/(2,2) &: \text{two boxes in row }1,\\
(3,3)/(2,2) &: \text{one box in row }1\text{ and one in row }2,\\
(3,2,1)/(2,2) &: \text{one box in row }1\text{ and one in row }3,\\
(2,2,2)/(2,2) &: \text{two boxes in row }3,\\
(2,2,1,1)/(2,2) &: \text{one box in row }3\text{ and one in row }4.
\end{align*}
Thus $(4,2)$ and $(2,2,2)$ are excluded, while $(3,3)$, $(3,2,1)$, and $(2,2,1,1)$ remain.
Equivalently, in Littlewood-Richardson tableau language the content is $(1,1)$, so the two entries are one $1$ and one $2$. For $(3,3)/(2,2)$ the two skew boxes lie in column $3$, so column strictness forces
\begin{align*}
\begin{array}{c}
1\\
2
\end{array}
\end{align*}
and the reading word is $12$. For $(3,2,1)/(2,2)$ the two skew boxes are not adjacent, but the reading order sees the row $1$ box before the row $3$ box; the lattice condition rules out the prefix $2$, so the reading word must be $12$. For $(2,2,1,1)/(2,2)$ the two skew boxes lie in column $1$, so column strictness again forces the upper box to be $1$ and the lower box to be $2$, giving reading word $12$. Each surviving shape therefore contributes exactly one Littlewood-Richardson tableau, and
\begin{align*}
s_{22}s_{11}=s_{33}+s_{321}+s_{2211}.
\end{align*}
This example shows the vertical Pieri rule in its smallest nontrivial form: multiplication by $e_2$ adds a vertical $2$-strip.
[/example]
## The Littlewood-Richardson Rule
The central problem is to prove that the tableaux just defined compute the coefficients in Schur multiplication. The rule is stronger than a positivity statement: it gives an algorithm for expanding every product in the Schur basis. It also supplies the same integers that appear in tensor product multiplicities for polynomial representations of general linear groups: if $V_\lambda$ and $V_\mu$ are irreducible polynomial representations of $GL_n$ with sufficiently many variables, then $c^\nu_{\lambda,\mu}$ is the multiplicity of $V_\nu$ in $V_\lambda\otimes V_\mu$. In Schubert calculus the same coefficient is the structure constant for multiplying Schubert classes in the cohomology ring of a Grassmannian, and therefore counts the corresponding transverse intersection problem when the partitions fit inside the relevant rectangle. These interpretations explain why a tableau-counting rule controls phenomena that at first look representation-theoretic or geometric rather than symmetric-functional.
[quotetheorem:5183]
[citeproof:5183]
The theorem has several built-in boundary conditions. The size condition $|\nu|=|\lambda|+|\mu|$ is forced by grading, while the containment condition $\lambda\subseteq\nu$ is what makes the skew diagram $\nu/\lambda$ available for tableaux. The lattice-word condition is the nontrivial selection rule: a semistandard skew tableau with the right content can still fail to represent a Schur product coefficient if one of its prefixes has too many large entries too early. This is why the rule proves Schur positivity for products but does not imply positivity in arbitrary bases. Its later uses all depend on this precision: the same integer is a structure constant in $\operatorname{Sym}$, a tableau count, and, in representation-theoretic settings, a multiplicity.
[example: A Single Coefficient]
Let $\lambda=(2,1)$, $\mu=(2,1)$, and $\nu=(3,2,1)$. Since
\begin{align*}
|\lambda|&=2+1=3,\\
|\mu|&=2+1=3,\\
|\nu|&=3+2+1=6,
\end{align*}
the size condition $|\nu|=|\lambda|+|\mu|$ holds. The skew diagram $\nu/\lambda$ consists of the boxes
\begin{align*}
(1,3),\quad (2,2),\quad (3,1),
\end{align*}
so it has $6-3=3$ boxes, matching the content $(2,1)$.
Fill these boxes by
\begin{align*}
T(1,3)=1,\qquad T(2,2)=1,\qquad T(3,1)=2.
\end{align*}
There are no two skew boxes in the same row and no two skew boxes in the same column, so the semistandard row and column conditions impose no further inequalities. The reading order is top row to bottom row, and within each row from right to left, hence
\begin{align*}
w(T)=1\,1\,2.
\end{align*}
Its prefixes are
\begin{align*}
1,\qquad 11,\qquad 112.
\end{align*}
For these prefixes the counts of entries equal to $1$ and entries equal to $2$ are
\begin{align*}
1 &: \#1=1,\ \#2=0,\\
11 &: \#1=2,\ \#2=0,\\
112 &: \#1=2,\ \#2=1,
\end{align*}
so every prefix has at least as many entries equal to $1$ as entries equal to $2$. Thus $w(T)$ is a lattice word, and $T$ is a Littlewood-Richardson tableau of shape $(3,2,1)/(2,1)$ and content $(2,1)$. Therefore this tableau contributes one count to $c^{321}_{21,21}$; determining the full coefficient requires checking the other fillings with two entries equal to $1$ and one entry equal to $2$ on the same three skew boxes.
[/example]
## Skew Schur Functions
Products are only one side of the rule. The same coefficients also describe the skew Schur functions introduced in Chapter 7, which are generating functions for semistandard tableaux of a fixed skew shape.
[definition: Skew Schur Function]
Let $\lambda\subset \nu$ be partitions. The skew Schur function $s_{\nu/\lambda}\in \operatorname{Sym}$ is the generating function
\begin{align*}
s_{\nu/\lambda}=\sum_T x^T,
\end{align*}
where the variables are $x_1,x_2,\dots$, the sum is over semistandard Young tableaux $T$ of shape $\nu/\lambda$ with positive-integer entries, and $x^T$ denotes the monomial determined by the content of $T$: if $T$ has $\alpha_i$ entries equal to $i$, then $x^T=x_1^{\alpha_1}x_2^{\alpha_2}\cdots$.
[/definition]
The sum is understood in the usual completed ring of symmetric formal power series. Since the skew shape has finitely many boxes, the resulting homogeneous symmetric series represents an element of $\operatorname{Sym}$.
Skew Schur functions package all semistandard fillings of a skew shape into a single symmetric function. To express that function in the Schur basis, we again sort tableaux by rectification shape, so the same Littlewood-Richardson numbers should appear with the inner shape fixed.
[quotetheorem:5208]
[citeproof:5208]
This theorem is often the most efficient way to compute a skew Schur function: enumerate Littlewood-Richardson tableaux of the skew shape once, grouped by content. The hypothesis $\lambda\subset\nu$ is part of the data, not a decorative assumption: if $\lambda=(3,1)$ and $\nu=(2,2)$, then the proposed row lengths do not leave a legitimate skew diagram, so $s_{\nu/\lambda}$ is not defined by tableaux. Partition indexing also matters; replacing $\nu$ by a composition such as $(2,3)$ does not specify a Young diagram in the standard top-left aligned convention until the row lengths are sorted, and that sorting changes the proposed skew shape. The finite-shape condition is what keeps $s_{\nu/\lambda}$ homogeneous of finite degree, while an infinite diagram would require a different completed object rather than an element of $\operatorname{Sym}$. The theorem also explains why the same integers occur in two different places: as product structure constants and as skew Schur expansion coefficients. The limitation is that it still requires a finite tableau enumeration, so complicated skew shapes can produce many semistandard fillings before the lattice condition is applied. In the next computations, the practical task is therefore to use row, column, and prefix constraints early enough to avoid double-counting or missing fillings.
[example: Expansion Of $s_{432/21}$]
Let
\begin{align*}
a&=T(1,4),& b&=T(1,3),& c&=T(2,3),\\
d&=T(2,2),& e&=T(3,2),& f&=T(3,1).
\end{align*}
The reading word is $abcdef$. Row weak increase gives
\begin{align*}
b\le a,\qquad d\le c,\qquad f\le e,
\end{align*}
and column strictness in columns $3$ and $2$ gives
\begin{align*}
b<c,\qquad d<e.
\end{align*}
The first letter of a lattice word must be $1$, so $a=1$. Then $b\le a$ forces $b=1$, and $b<c$ forces $c>1$. Since the prefix $11c$ must still be lattice, $c$ cannot be $3$ or larger; hence $c=2$. Thus every contributing reading word begins with
\begin{align*}
abc=112.
\end{align*}
It remains to choose $d,e,f$ with
\begin{align*}
d\le 2,\qquad d<e,\qquad f\le e.
\end{align*}
The possible completions that satisfy the lattice condition are
\begin{align*}
112121 &\quad\text{with content }(4,2),\\
112131 &\quad\text{with content }(4,1,1),\\
112122 &\quad\text{with content }(3,3),\\
112132 &\quad\text{with content }(3,2,1),\\
112231 &\quad\text{with content }(3,2,1),\\
112233 &\quad\text{with content }(2,2,2).
\end{align*}
For example, the word $112231$ has prefix count triples
\begin{align*}
(1,0,0),\quad (2,0,0),\quad (2,1,0),\quad
(2,2,0),\quad (2,2,1),\quad (3,2,1),
\end{align*}
so at every prefix the number of entries equal to $1$ is at least the number of entries equal to $2$, and the number of entries equal to $2$ is at least the number of entries equal to $3$. The other displayed words are checked in the same way:
\begin{align*}
112121 &: (1,0,0),(2,0,0),(2,1,0),(3,1,0),(3,2,0),(4,2,0),\\
112131 &: (1,0,0),(2,0,0),(2,1,0),(3,1,0),(3,1,1),(4,1,1),\\
112122 &: (1,0,0),(2,0,0),(2,1,0),(3,1,0),(3,2,0),(3,3,0),\\
112132 &: (1,0,0),(2,0,0),(2,1,0),(3,1,0),(3,1,1),(3,2,1),\\
112233 &: (1,0,0),(2,0,0),(2,1,0),(2,2,0),(2,2,1),(2,2,2).
\end{align*}
A completion involving a $4$ cannot contribute: after the forced prefix $112$, the first $4$ would need a preceding $3$ in the same prefix, but the inequalities $d\le 2$, $d<e$, and $f\le e$ leave no placement of $3$ before that first $4$ without violating either lattice order or semistandardness.
Therefore the Littlewood-Richardson tableaux of shape $(4,3,2)/(2,1)$ have contents
\begin{align*}
(4,2),\quad (4,1,1),\quad (3,3),\quad (3,2,1),\quad (3,2,1),\quad (2,2,2).
\end{align*}
By *Skew Schur Expansion*, grouping these tableaux by content gives
\begin{align*}
s_{432/21}=s_{42}+s_{411}+s_{33}+2s_{321}+s_{222}.
\end{align*}
The coefficient $2$ appears only for content $(3,2,1)$, because exactly the two words $112132$ and $112231$ satisfy all row, column, and lattice constraints.
[/example]
## Symmetry and Commutativity
The ring $\operatorname{Sym}$ is commutative, so $s_\lambda s_\mu=s_\mu s_\lambda$. The combinatorial rule should therefore imply a symmetry that is not apparent from the definition, since $c^\nu_{\lambda,\mu}$ counts tableaux on $\nu/\lambda$ while $c^\nu_{\mu,\lambda}$ counts tableaux on $\nu/\mu$.
[quotetheorem:5209]
[citeproof:5209]
The symmetry is a useful check on computations. If a proposed list of coefficients violates it, then either the reading convention, the content, or the skew shape enumeration has been mishandled. The statement is not visible from the definition of a Littlewood-Richardson tableau, because the two sides count fillings of different skew shapes. Its proof therefore records a real compatibility between tableau switching and the commutativity of the symmetric-function ring. Later representation-theoretic interpretations read the same identity as symmetry of tensor product multiplicities.
[example: Symmetry Check]
The earlier product computation was
\begin{align*}
s_{21}s_2=s_{41}+s_{32}+s_{311}+s_{221},
\end{align*}
so the coefficient of $s_{32}$ is
\begin{align*}
c^{32}_{21,2}=1.
\end{align*}
By *Commutativity Symmetry*,
\begin{align*}
c^{32}_{2,21}=c^{32}_{21,2}=1.
\end{align*}
Thus there should be exactly one Littlewood-Richardson tableau of shape $(3,2)/(2)$ and content $(2,1)$.
Write the skew boxes as
\begin{align*}
a&=T(1,3),& b&=T(2,2),& c&=T(2,1).
\end{align*}
The reading word is $abc$, because the top row is read first and the second row is read from right to left. The content $(2,1)$ means that the entries contain two copies of $1$ and one copy of $2$. The first letter of a lattice word must be $1$, so $a=1$. The remaining entries $b,c$ are therefore $1$ and $2$ in some order. Row weak increase in the second row gives
\begin{align*}
c\le b.
\end{align*}
If $b=1$ and $c=2$, then $c\le b$ becomes $2\le 1$, impossible. Hence
\begin{align*}
b=2,\qquad c=1,
\end{align*}
and the only possible filling has reading word
\begin{align*}
w(T)=121.
\end{align*}
Its prefixes have counts
\begin{align*}
1 &: \#1=1,\ \#2=0,\\
12 &: \#1=1,\ \#2=1,\\
121 &: \#1=2,\ \#2=1.
\end{align*}
Each prefix has at least as many entries equal to $1$ as entries equal to $2$, so $121$ is a lattice word. Therefore the unique contributing tableau is
\begin{align*}
T(1,3)=1,\qquad T(2,2)=2,\qquad T(2,1)=1,
\end{align*}
which confirms combinatorially that $c^{32}_{2,21}=1$.
[/example]
## How to Compute with the Rule
The practical difficulty in using the rule is that the number of possible fillings grows quickly, and it is easy to miss a filling or count the same content in the wrong outer shape. A reliable Littlewood-Richardson computation therefore follows a fixed routine. First list all possible outer partitions $\nu$ of the correct size containing $\lambda$. Next draw the skew shape $\nu/\lambda$, impose semistandard row and column conditions, and then filter the remaining fillings by the lattice reading word condition. Finally group the surviving tableaux by outer shape for products, or by content for skew Schur functions.
[remark: Common Sources Of Error]
The reading word is taken row by row from right to left, starting at the top. The content $\mu$ records multiplicities of entries, not row lengths of the skew shape. The lattice condition must be checked on every prefix of the reading word, not only on the whole word.
[/remark]
These conventions make the Littlewood-Richardson rule a bridge between algebra and tableaux. It turns multiplication in $\operatorname{Sym}$ into a finite, positive, and computable counting problem, while jeu de taquin explains why this count is stable under the algebraic operations developed earlier in the course. The next chapter uses these Schur functions under specializations such as $x_i=q^{i-1}$ and $x_1=\cdots=x_n=1$ to extract tableau-counting formulas.
The Littlewood-Richardson rule turns Schur multiplication into a counting problem, and specializations turn those counts into explicit numerical formulas. Chapter 9 applies this machinery to finite alphabets and geometric progressions of variables, extracting enumerative consequences from the same tableau data.
# 9. Specializations and Enumerative Consequences
This chapter studies what Schur functions become after the variables are specialized to simple geometric progressions or to finite alphabets of ones. It uses the preceding material on partitions, Young diagrams, semistandard and standard Young tableaux, Schur functions, the bialternant formula, and basic generating functions. Chapters 4 through 8 developed Schur functions through tableaux, determinants, Cauchy identities, and product rules; here we turn those formulas into enumerative statements. The central theme is that a symmetric function identity can become a counting formula once its variables are chosen with enough structure.
## Principal Specialization
How can a symmetric function in infinitely many variables produce a finite product? The most useful specialization sends the variables to a geometric progression, so that the weight of a tableau records the sum of its entries rather than the full content vector.
[definition: Principal Specialization]
The principal specialization is the map
\begin{align*}
\operatorname{ps}:\operatorname{Sym} &\longrightarrow \mathbb Q[[q]],
& f &\longmapsto f(1,q,q^2,q^3,\dots).
\end{align*}
For $n \in \mathbb N$, the finite principal specialization is the map
\begin{align*}
\operatorname{ps}_n:\operatorname{Sym} &\longrightarrow \mathbb Q[q],
& f &\longmapsto f(1,q,q^2,\dots,q^{n-1},0,0,\dots).
\end{align*}
[/definition]
The infinite specialization is best understood in the completed ring of formal power series in $q$. The finite specialization is an honest polynomial whenever $f$ is homogeneous, and it raises the problem of evaluating familiar bases after the substitution $x_i=q^{i-1}$.
[example: Principal Specialization of Complete Functions]
For the complete symmetric function $h_r$, each monomial corresponds to a weakly increasing word $a_1\le \cdots \le a_r$ in the alphabet $\{1,\dots,n\}$. Under $\operatorname{ps}_n$, the letter $a$ contributes $q^{a-1}$, so the word $a_1\cdots a_r$ contributes
\begin{align*}
q^{(a_1-1)+\cdots+(a_r-1)}.
\end{align*}
Equivalently, choosing how many times each letter $i+1$ appears gives
\begin{align*}
\sum_{r\ge 0}\operatorname{ps}_n(h_r)t^r
&=\sum_{m_0,\dots,m_{n-1}\ge 0}(q^0t)^{m_0}(q^1t)^{m_1}\cdots(q^{n-1}t)^{m_{n-1}}\\
&=\prod_{i=0}^{n-1}\left(\sum_{m_i\ge 0}(q^it)^{m_i}\right)\\
&=\prod_{i=0}^{n-1}\frac{1}{1-q^it}.
\end{align*}
Thus $\operatorname{ps}_n(h_r)$ is the $q$-generating function for weakly increasing words of length $r$ with letters in $\{1,\dots,n\}$.
For $r=2$ and $n=3$, the weakly increasing words are
\begin{align*}
11,\quad 12,\quad 13,\quad 22,\quad 23,\quad 33.
\end{align*}
Their weights are
\begin{align*}
11&\mapsto q^{0+0}=1,\\
12&\mapsto q^{0+1}=q,\\
13&\mapsto q^{0+2}=q^2,\\
22&\mapsto q^{1+1}=q^2,\\
23&\mapsto q^{1+2}=q^3,\\
33&\mapsto q^{2+2}=q^4.
\end{align*}
Adding these contributions gives
\begin{align*}
\operatorname{ps}_3(h_2)
&=1+q+q^2+q^2+q^3+q^4\\
&=1+q+2q^2+q^3+q^4.
\end{align*}
The coefficient $2$ of $q^2$ records the two different words, $13$ and $22$, with the same shifted-entry sum.
[/example]
The complete functions show that principal specialization packages a weighted counting problem into a product. For Schur functions, direct tableau summation becomes difficult because the row and column inequalities couple the entries: choosing one entry changes which values remain legal nearby. The obstruction is that positivity alone does not explain why the answer should factor, so we turn to the bialternant formula, where specialization turns the determinant quotient into a Vandermonde quotient.
[quotetheorem:5210]
[citeproof:5210]
The hypothesis $\ell(\lambda)\le n$ is needed because a Schur function in $n$ variables vanishes for shapes with more than $n$ rows; for example $s_{(1,1)}(x_1)=0$, since a column of length two cannot be filled strictly using one letter. The theorem does not evaluate the full infinite principal specialization directly, nor does it say that each factor separately has a tableau interpretation; the product only becomes the specialized Schur polynomial after all cancellations are combined. Its role is to replace a constrained tableau sum by a product, which is exactly what will allow the unweighted limit $q\to 1$.
The formula separates into a monomial shift and a product of ratios. The next step is to let the product interact with hook lengths, so the exponent of the monomial factor must be isolated rather than carried through every display. This exponent records the forced contribution of the lowest possible entry in each row: row $i$ contributes at least $q^{i-1}$ for each of its $\lambda_i$ cells. We therefore introduce the statistic that measures this unavoidable row weight before turning to hook and content factors.
[definition: The Statistic n Lambda]
The row-weight statistic is the map
\begin{align*}
n:\{\text{partitions}\} &\longrightarrow \mathbb N\cup\{0\},
& \lambda &\longmapsto \sum_{i\ge 1}(i-1)\lambda_i.
\end{align*}
[/definition]
This statistic measures the contribution of the row number to the minimal semistandard filling. Since entries in row $i$ must be at least $i$ in the column-strict convention, the tableau of smallest principal weight has row $i$ filled with $i$.
[example: Principal Specialization for Shape Two One]
For $\lambda=(2,1)$ and $n=3$, extend $\lambda$ by zeros, so
\begin{align*}
\lambda_1=2,\qquad \lambda_2=1,\qquad \lambda_3=0,
\end{align*}
and
\begin{align*}
n(\lambda)=(1-1)\lambda_1+(2-1)\lambda_2+(3-1)\lambda_3=0\cdot2+1\cdot1+2\cdot0=1.
\end{align*}
The three factors in the principal-specialization product come from the pairs $(1,2)$, $(1,3)$, and $(2,3)$:
\begin{align*}
s_{(2,1)}(1,q,q^2)
&=q^{n(\lambda)}
\frac{1-q^{\lambda_1-\lambda_2+2-1}}{1-q^{2-1}}
\frac{1-q^{\lambda_1-\lambda_3+3-1}}{1-q^{3-1}}
\frac{1-q^{\lambda_2-\lambda_3+3-2}}{1-q^{3-2}}\\
&=q
\frac{1-q^{2-1+1}}{1-q}
\frac{1-q^{2-0+2}}{1-q^2}
\frac{1-q^{1-0+1}}{1-q}\\
&=q\frac{(1-q^2)(1-q^4)(1-q^2)}{(1-q)(1-q^2)(1-q)}.
\end{align*}
Now factor
\begin{align*}
1-q^2&=(1-q)(1+q),\\
1-q^4&=(1-q^2)(1+q^2),
\end{align*}
so
\begin{align*}
s_{(2,1)}(1,q,q^2)
&=q\frac{(1-q)(1+q)(1-q^2)(1+q^2)(1-q)(1+q)}{(1-q)(1-q^2)(1-q)}\\
&=q(1+q)^2(1+q^2)\\
&=q+2q^2+2q^3+2q^4+q^5.
\end{align*}
The lowest power is $q$, coming from the semistandard filling with top row $1,1$ and second row $2$, whose principal weight is $q^{(1-1)+(1-1)+(2-1)}=q$. A direct tableau check gives the same polynomial: the weights are $q$, two tableaux of weight $q^2$, two of weight $q^3$, two of weight $q^4$, and one of weight $q^5$. The two largest weights come from the fillings with entries $(2,2,3)$ and $(2,3,3)$ in reading order. Thus the factors above encode all allowed increases of entries while keeping rows weakly increasing and columns strictly increasing.
[/example]
## Hook Contents and Bounded Entries
The next question is what happens when $q$ tends to $1$. Principal specialization has retained the shape of the product, and the limit should count tableaux without weighting their entries.
[definition: Content of a Cell]
Let $\lambda$ be a partition. The content function is the map
\begin{align*}
c:\lambda &\longrightarrow \mathbb Z,
& u=(i,j) &\longmapsto j-i.
\end{align*}
[/definition]
The content records the diagonal on which a cell lies and explains the numerator that will appear after specialization. To describe the denominator of the same product, we need the local statistic attached to the cells lying to the right and below a chosen cell.
[definition: Hook Length of a Cell]
Let $\lambda$ be a partition. The hook-length function is the map
\begin{align*}
h:\lambda &\longrightarrow \mathbb N,
& u=(i,j) &\longmapsto \lambda_i-j+\lambda'_j-i+1,
\end{align*}
where $\lambda'$ is the conjugate partition.
[/definition]
The hook length counts the cell $u$, the cells to its right in the same row, and the cells below it in the same column. Contents and hooks together give the product that counts semistandard tableaux with a bounded alphabet.
[quotetheorem:5211]
[citeproof:5211]
The condition $\ell(\lambda)\le n$ is essential: if $\lambda=(1,1)$ and $n=1$, a column would need two strictly increasing entries chosen from the one-letter alphabet, so no tableau exists. The theorem does not count standard tableaux, and it does not give a visibly integral product term by term; integrality comes from the Schur function or tableau interpretation of the whole product. This formula is a finite-dimensional shadow of Schur positivity and also the dimension formula for polynomial irreducible representations of $GL_n$: the same product computes the dimension of the irreducible representation of highest weight $\lambda$, because semistandard tableaux index the corresponding weight basis. The course uses the tableau interpretation, but the representation-theoretic reading explains why the dependence on $n$ is polynomial and why the same hooks and contents recur.
This viewpoint turns the formula into a practical enumerative tool: compute the local contents and hook lengths, multiply the resulting factors, and rely on the tableau interpretation to explain the final integer.
[example: Semistandard Tableaux of Shape Three Two With Entries at Most Four]
Let $\lambda=(3,2)$ and $n=4$. We label the cells by their row and column coordinates:
\begin{align*}
(1,1),\quad (1,2),\quad (1,3),\quad (2,1),\quad (2,2).
\end{align*}
Their contents $c(i,j)=j-i$ are
\begin{align*}
c(1,1)&=1-1=0,\\
c(1,2)&=2-1=1,\\
c(1,3)&=3-1=2,\\
c(2,1)&=1-2=-1,\\
c(2,2)&=2-2=0.
\end{align*}
Since $\lambda'=(2,2,1)$, the hook lengths $h(i,j)=\lambda_i-j+\lambda'_j-i+1$ are
\begin{align*}
h(1,1)&=3-1+2-1+1=4,\\
h(1,2)&=3-2+2-1+1=3,\\
h(1,3)&=3-3+1-1+1=1,\\
h(2,1)&=2-1+2-2+1=2,\\
h(2,2)&=2-2+2-2+1=1.
\end{align*}
By the *Hook Content Formula*,
\begin{align*}
s_{(3,2)}(1^4)
&=\prod_{u\in(3,2)}\frac{4+c(u)}{h(u)}\\
&=\frac{4+0}{4}\cdot\frac{4+1}{3}\cdot\frac{4+2}{1}\cdot\frac{4+(-1)}{2}\cdot\frac{4+0}{1}\\
&=\frac{4}{4}\cdot\frac{5}{3}\cdot\frac{6}{1}\cdot\frac{3}{2}\cdot\frac{4}{1}\\
&=1\cdot\frac{5}{3}\cdot6\cdot\frac{3}{2}\cdot4\\
&=1\cdot10\cdot\frac{3}{2}\cdot4\\
&=15\cdot4\\
&=60.
\end{align*}
Thus there are $60$ semistandard Young tableaux of shape $(3,2)$ with entries in $\{1,2,3,4\}$.
[/example]
The same product immediately detects when tableaux cannot exist. If $\ell(\lambda)>n$, some cell in the first column has content $1-n$, so the numerator includes a zero factor.
[remark: Vanishing for Too Many Rows]
For $\ell(\lambda)>n$, there is no semistandard Young tableau of shape $\lambda$ with entries in $\{1,\dots,n\}$. A column of length greater than $n$ would require more than $n$ strictly increasing entries chosen from $\{1,\dots,n\}$.
[/remark]
## Standard Tableaux and the Hook Length Formula
Semistandard tableaux allow repeated entries, while standard tableaux use each number $1,\dots,|\lambda|$ once. The final specialization asks how standard tableaux arise as the lowest-order behaviour of semistandard enumeration when the number of allowed labels grows.
[definition: Standard Young Tableau]
Let $\lambda$ be a partition of $r$. A standard Young tableau of shape $\lambda$ is a filling of the cells of $\lambda$ with $1,\dots,r$, each used exactly once, such that rows increase from left to right and columns increase from top to bottom.
[/definition]
Standard tableaux are the linear extensions of the poset on the cells of $\lambda$ in which a cell must precede the cell to its right and the cell below it. Since this number occurs often in the specialization limit, we introduce a compact notation for it.
[definition: The Number f Lambda]
For $r\in\mathbb N$, the standard-tableau counting function is the map
\begin{align*}
f:\{\text{partitions of }r\} &\longrightarrow \mathbb N,
& \lambda &\longmapsto f_\lambda:=|\{\text{standard Young tableaux of shape }\lambda\}|.
\end{align*}
[/definition]
The notation $f_\lambda$ isolates the standard-tableau count from the semistandard counts $s_\lambda(1^n)$. The next problem is to recover $f_\lambda$ from the asymptotic behaviour of $s_\lambda(1^n)$ as the alphabet size grows.
[quotetheorem:5212]
[citeproof:5212]
The assumption that $\lambda$ has exactly $r$ cells is needed because the normalization $n^r$ matches the degree of the semistandard counting polynomial; using $n^{r+1}$ would force the limit to $0$, while using $n^{r-1}$ would usually diverge. There is also a genuine boundary in the standardization argument: the leading coefficient is obtained with $\lambda$ fixed while $n$ grows. If instead the shape were allowed to grow with $n$, the descent constraints and the number of cells would change at the same time as the alphabet, so the fixed-degree polynomial argument would no longer apply. The theorem does not give the lower-order coefficients of $s_\lambda(1^n)$, which still remember descent restrictions in the standardization process. Its purpose is to identify the leading coefficient with the standard-tableau count, so that the hook-content product can now be converted into a closed formula for $f_\lambda$.
This limit extracts only the top-degree term in $n$. Because the hook-content formula gives $s_\lambda(1^n)$ as an explicit degree $r$ polynomial in $n$, the remaining problem is to read off only the leading coefficient of that product.
[quotetheorem:5213]
[citeproof:5213]
The hypothesis $|\lambda|=r$ is needed because $r!$ counts the possible total orderings of the $r$ cells before the row and column constraints are imposed; for instance, using $5!$ for the four-cell shape $(2,2)$ would give the wrong scale. The theorem does not describe the set of standard tableaux or give a direct bijection from tableaux to hook choices; it gives only the cardinality. The formula is surprising because it expresses a global linear-extension count as a product of local hook lengths, and it is the most efficient way to compute $f_\lambda$ for a fixed shape.
[example: Standard Tableaux of Shape Four Two One]
Let $\lambda=(4,2,1)$, so $r=|\lambda|=4+2+1=7$. Its conjugate partition is $\lambda'=(3,2,1,1)$, so the hook lengths $h(i,j)=\lambda_i-j+\lambda'_j-i+1$ are
\begin{align*}
h(1,1)&=4-1+3-1+1=6,&
h(1,2)&=4-2+2-1+1=4,\\
h(1,3)&=4-3+1-1+1=2,&
h(1,4)&=4-4+1-1+1=1,\\
h(2,1)&=2-1+3-2+1=3,&
h(2,2)&=2-2+2-2+1=1,\\
h(3,1)&=1-1+3-3+1=1.
\end{align*}
By the *Frame Robinson Thrall Hook Length Formula*,
\begin{align*}
f_{(4,2,1)}
&=\frac{7!}{6\cdot4\cdot2\cdot1\cdot3\cdot1\cdot1}\\
&=\frac{5040}{6\cdot4\cdot2\cdot3}\\
&=\frac{5040}{144}\\
&=35.
\end{align*}
Thus there are $35$ standard Young tableaux of shape $(4,2,1)$.
[/example]
## Consequences for Schur Enumeration
The formulas in this chapter should be read as a dictionary between algebra and enumeration, continuing the course's pattern of proving identities in $\operatorname{Sym}$ before extracting finite counting statements by specialization. Evaluating a Schur function at $1^n$ counts bounded semistandard tableaux, taking a leading coefficient counts standard tableaux, and principal specialization refines both by remembering a $q$-weight.
[explanation: Three Levels of Specialization]
The full Schur function $s_\lambda(x_1,x_2,\dots)$ records the complete content of a semistandard tableau: each entry $i$ contributes a factor $x_i$. The finite specialization $s_\lambda(1^n)$ forgets the content but keeps the restriction that all entries are at most $n$. The principal specialization $s_\lambda(1,q,q^2,\dots,q^{n-1})$ lies between these two extremes by recording the sum of the shifted entries.
The hook-content formula is obtained from the principal specialization by setting $q=1$. The hook-length formula is then obtained from the hook-content formula by keeping only the leading term as $n$ grows. This chain of specializations is a model for a recurring technique in algebraic combinatorics: prove a strong symmetric-function identity first, then extract counting formulas by specialization.
[/explanation]
The specialization chain also explains why several rational-looking products in this chapter are integers. The next remark records this point because it is a useful check on computations with hooks and contents.
[remark: Integrality Hidden in Products]
Neither
\begin{align*}
\prod_{u\in\lambda}\frac{n+c(u)}{h(u)}
\end{align*}
nor
\begin{align*}
\frac{r!}{\prod_{u\in\lambda}h(u)}
\end{align*}
looks integral from its product form. The tableau interpretations prove integrality by identifying these quantities with finite sets.
[/remark]
Once specializations produce concrete counting formulas, it is natural to ask for a more systematic operator calculus behind them. Chapter 10 develops generating methods and symmetric-function operators that package the earlier identities into a compact algebraic language.
# 10. Operators and Generating Methods
The previous chapters built Schur functions from tableaux and determinants, then developed the Cauchy identity, the Pieri rules, the Littlewood-Richardson rule, and specialization formulas. This chapter assumes that background, together with the homogeneous grading of $\operatorname{Sym}_\mathbb Q$, the Schur basis, the complete, elementary, and power-sum bases, and the Hall inner product on Schur functions. The new viewpoint is operator-theoretic: instead of treating symmetric functions only as elements of a ring, we also treat them as linear maps acting on that ring.
The main obstruction is that coefficient identities become unwieldy when every calculation is carried out by expanding products directly in the Schur basis. Multiplication, adjunction under the Hall inner product, skewing, and plethystic substitution provide a way to move factors across pairings, remove shapes instead of adding them, and rewrite generating functions by changing alphabets. Once the Cauchy kernel is understood as a reproducing object, skew Schur functions become adjoint multiplication operators, and plethystic notation becomes a controlled language for substitutions such as $X+Y$, $X-Y$, and $XY$.
## Multiplication and Adjoint Operators
Multiplication by a symmetric function is the most basic operation on $\operatorname{Sym}_\mathbb Q$, but multiplication alone only moves degrees upward and gives no intrinsic way to undo a factor. The Hall inner product supplies the missing duality: it lets a multiplier be transported from one side of a pairing to the other. The guiding problem is: given $f,g,h \in \operatorname{Sym}_\mathbb Q$, can the factor $f$ in $fg$ be moved from the left entry of an inner product to the right entry as an operator on $h$?
[remark: Schur-Orthonormal Form Of The Hall Inner Product]
We continue using the Hall inner product from Chapter 3. By the self-duality consequence of the Schur Cauchy identity, the same pairing may be recognized by
\begin{align*}
\langle s_\lambda, s_\mu \rangle = \delta_{\lambda\mu}
\end{align*}
for all partitions $\lambda$ and $\mu$.
[/remark]
This is the inner product for which the Schur basis is orthonormal. It is also the inner product under which the power-sum basis is orthogonal:
\begin{align*}
\langle p_\lambda,p_\mu\rangle=z_\lambda\delta_{\lambda\mu}.
\end{align*}
This orthogonality fact is used later to compute explicit adjoints. To use the pairing as an operator calculus, we first isolate multiplication as a named linear map.
[definition: Multiplication Operator]
For $f \in \operatorname{Sym}_\mathbb Q$, the multiplication operator is the $\mathbb Q$-linear map
\begin{align*}
M_f: \operatorname{Sym}_\mathbb Q \to \operatorname{Sym}_\mathbb Q, \qquad M_f(g)=fg.
\end{align*}
[/definition]
Multiplication raises degree by $\deg f$, so a direct inverse is usually impossible: multiplication by $f$ need not be injective on every quotient or preserve a chosen basis in a triangular way. What survives canonically is the transpose operation determined by the Hall pairing. The notation $f^\perp$ records this degree-lowering operator, and its definition is chosen so that moving $f$ across the Hall pairing becomes a formal identity.
[definition: Adjoint Operator]
For $f \in \operatorname{Sym}_\mathbb Q$, the adjoint operator is the $\mathbb Q$-linear map
\begin{align*}
f^\perp: \operatorname{Sym}_\mathbb Q \to \operatorname{Sym}_\mathbb Q
\end{align*}
satisfying
\begin{align*}
\langle fg,h\rangle = \langle g,f^\perp h\rangle
\end{align*}
for all $g,h \in \operatorname{Sym}_\mathbb Q$.
[/definition]
The existence and uniqueness of $f^\perp$ follow from the nondegeneracy of the Hall inner product on each homogeneous component. The definition is often more useful than an explicit formula because it turns coefficient extraction into an inner product computation, and the next result records the basic degree bookkeeping.
[quotetheorem:5214]
[citeproof:5214]
This formula is the algebraic reason that adjoint operators are lowering operators. The homogeneity hypothesis in the degree statement is needed: if $f=h_1+h_2$, then $f^\perp$ has components lowering degree by $1$ and by $2$, so it does not send $\operatorname{Sym}_{\mathbb Q,n}$ into a single homogeneous component. The theorem does not give an explicit expansion of $f^\perp h$ in a preferred basis; it only gives the adjointness relation and degree support. To use the general construction in computations, the following definition names the adjoints associated with the standard symmetric-function generators.
[definition: Classical Perp Operators]
For $r \ge 0$, the classical perp operators are the $\mathbb Q$-linear maps
\begin{align*}
h_r^\perp,e_r^\perp,p_r^\perp: \operatorname{Sym}_\mathbb Q \to \operatorname{Sym}_\mathbb Q
\end{align*}
defined as the adjoints of multiplication by $h_r$, $e_r$, and $p_r$, respectively.
[/definition]
The operators $h_r^\perp$ and $e_r^\perp$ should be compared with the Pieri rules for multiplying Schur functions by $h_r$ and $e_r$. Multiplication by $h_r$ or $e_r$ adds strips to a Schur function, so adjointness asks for the reverse operation under the Hall inner product: which Schur functions pair nontrivially after a strip has been added? This turns the computational problem into a removal rule. The following theorem gives the precise Schur-basis expansion by identifying the shapes obtained from a partition after removing the corresponding horizontal or vertical strips.
[quotetheorem:5215]
[citeproof:5215]
The adjoint rules are the reverse direction of the Pieri rules, not a new independent positivity phenomenon. Their hypotheses keep the removal operation inside genuine skew diagrams: horizontal removals are dual to adding rows, and vertical removals are dual to adding columns. This distinction is essential because the Hall inner product detects Schur coefficients exactly; if the removed boxes violate the strip condition, the corresponding multiplication coefficient is zero and the adjoint has no matching term. The theorem is also limited to adjoints of one-row and one-column Schur functions, while general skewing by $s_\lambda^\perp$ requires the full Littlewood-Richardson coefficients introduced earlier.
[illustration:horizontal-two-strip-removals-421]
[example: Removing A Horizontal Strip]
Let $\nu=(4,2,1)$. By the *Pieri Adjoint Rules*,
\begin{align*}
h_2^\perp s_{421}
&=\sum_{\lambda:\ (4,2,1)/\lambda\text{ is a horizontal }2\text{-strip}} s_\lambda.
\end{align*}
Such a $\lambda$ must be a partition of $7-2=5$ contained in $(4,2,1)$. The partitions of $5$ contained in $(4,2,1)$ are
\begin{align*}
(2,2,1),\qquad (3,1,1),\qquad (3,2),\qquad (4,1).
\end{align*}
For these four partitions, the removed boxes occupy the following columns:
\begin{align*}
(4,2,1)/(2,2,1)&:\text{ columns }3,4,\\
(4,2,1)/(3,1,1)&:\text{ columns }4,2,\\
(4,2,1)/(3,2)&:\text{ columns }4,1,\\
(4,2,1)/(4,1)&:\text{ columns }2,1.
\end{align*}
In each case the two removed boxes lie in distinct columns, so each difference is a horizontal $2$-strip. Therefore
\begin{align*}
h_2^\perp s_{421}
&=s_{221}+s_{311}+s_{32}+s_{41}.
\end{align*}
This is the strip-removal form of Pieri: the adjoint to multiplication by $h_2$ removes two boxes, subject to the condition that no two removed boxes lie in the same column.
[/example]
The Pieri adjoints are governed by Schur product rules, but the power-sum adjoint is governed by the diagonal form of the Hall inner product. Because $\operatorname{Sym}_\mathbb Q$ is a polynomial algebra in the $p_r$, this motivates the following theorem, which writes the resulting operator as a partial derivative.
[quotetheorem:5192]
[citeproof:5192]
This derivation formula explains why the adjoint operators are efficient coefficient extractors in generating-function arguments. The restriction to $\operatorname{Sym}_\mathbb Q$ is necessary because the formula differentiates with respect to the algebraically independent generators $p_r$ and uses the scalar $r$ in the coefficient field. A concrete obstruction over $\mathbb Z$ is already visible in degree $2$:
\begin{align*}
h_2=\frac{p_1^2+p_2}{2}.
\end{align*}
Thus the integral symmetric-function ring is not the polynomial ring $\mathbb Z[p_1,p_2,\dots]$, so the phrase $\partial/\partial p_2$ does not define an intrinsic integral-coordinate derivation on the integral form. A direct test shows the obstruction: if the formula were an integral-coordinate derivation, then applying $2\partial/\partial p_2$ to $h_2$ would require differentiating the expression $(p_1^2+p_2)/2$, which uses the coefficient $1/2$ outside $\mathbb Z$. The genuine adjoint is still defined by the Hall pairing and gives $p_2^\perp h_2=1$, but that value is obtained only after passing through the rational power-sum coordinates. The result does not claim that $p_r^\perp$ is a derivation on every integral form of symmetric functions; it is a differential operator in the rational power-sum coordinates. The same perspective becomes more geometric when the multiplier is a Schur function.
## Skewing Operators and Skew Schur Functions
Skew Schur functions were introduced as generating functions of semistandard tableaux of skew shape. If one computes them only by enumerating tableaux, the connection with multiplication rules is hidden and coefficient extraction has to be repeated shape by shape. The operator question is: does the skew shape $\nu/\lambda$ already appear inside the ring structure of $\operatorname{Sym}_\mathbb Q$ without building tableaux from scratch?
[definition: Skewing By A Symmetric Function]
For $f \in \operatorname{Sym}_\mathbb Q$, skewing by $f$ is the $\mathbb Q$-linear operator
\begin{align*}
f^\perp: \operatorname{Sym}_\mathbb Q \to \operatorname{Sym}_\mathbb Q
\end{align*}
adjoint to multiplication by $f$ under the Hall inner product.
[/definition]
When $f=s_\lambda$, the notation $s_\lambda^\perp$ should recover a familiar object rather than merely introduce a new symbol. The way to test this is to compare Schur coefficients, because those coefficients are already controlled by the Littlewood-Richardson rule. This comparison identifies skewing by $s_\lambda$ with the skew Schur function of shape $\nu/\lambda$.
[quotetheorem:5216]
[citeproof:5216]
This theorem turns combinatorial skew shapes into algebraic operators. The containment hypothesis is not cosmetic: if $\lambda=(2,1)$ and $\nu=(2)$, then there is no Young diagram difference $\nu/\lambda$, and degree considerations force $s_\lambda^\perp s_\nu=0$. The theorem does not provide a closed positive formula for the Schur expansion by itself; that positivity still enters through the Littlewood-Richardson rule or through tableaux. It also lets computations move between three interchangeable languages: adjointness, Littlewood-Richardson coefficients, and skew tableaux.
[example: Computing s_{21} Perp s_{432}]
We compute $s_{21}^\perp s_{432}$. By *Skew Schur Functions From Adjoint Operators*,
\begin{align*}
s_{21}^\perp s_{432}=s_{432/21}.
\end{align*}
Write the six boxes of $432/21$ as
\begin{align*}
\begin{array}{cc}
a&b\\
c&d\\
e&f
\end{array}
\end{align*}
where the actual columns are $(3,4)$ in the first row, $(2,3)$ in the second row, and $(1,2)$ in the third row. Semistandardness gives
\begin{align*}
a\le b,\qquad c\le d,\qquad e\le f,\qquad a<d,\qquad c<f,
\end{align*}
and the Littlewood-Richardson reading word is
\begin{align*}
b,a,d,c,f,e.
\end{align*}
The lattice condition forces $b=1$, since the first letter of a lattice word must be $1$. Then $a\le b$ gives $a=1$. Since $a<d$, we have $d>1$; the third letter $d$ cannot exceed $2$, because the prefix $1,1,d$ would otherwise contain a $3$ before any $2$. Hence $d=2$.
Now $c\le d=2$. If $c=1$, then $f>1$, and the fifth letter $f$ can be $2$ or $3$. This gives
\begin{align*}
\begin{array}{cc}
1&1\\
1&2\\
1&2
\end{array}
&\quad\text{of content }(4,2),\\
\begin{array}{cc}
1&1\\
1&2\\
2&2
\end{array}
&\quad\text{of content }(3,3),\\
\begin{array}{cc}
1&1\\
1&2\\
1&3
\end{array}
&\quad\text{of content }(4,1,1),\\
\begin{array}{cc}
1&1\\
1&2\\
2&3
\end{array}
&\quad\text{of content }(3,2,1).
\end{align*}
If $c=2$, then $f>2$, and the fifth letter must be $3$; the allowed choices for $e$ are the ones preserving the lattice condition, namely $e=1$ or $e=3$. These give
\begin{align*}
\begin{array}{cc}
1&1\\
2&2\\
1&3
\end{array}
&\quad\text{of content }(3,2,1),\\
\begin{array}{cc}
1&1\\
2&2\\
3&3
\end{array}
&\quad\text{of content }(2,2,2).
\end{align*}
Thus the Littlewood-Richardson tableaux of shape $432/21$ have contents
\begin{align*}
(4,2),\qquad (4,1,1),\qquad (3,3),\qquad (3,2,1),\qquad (3,2,1),\qquad (2,2,2).
\end{align*}
Therefore
\begin{align*}
s_{21}^\perp s_{432}
=s_{432/21}
=s_{42}+s_{411}+s_{33}+2s_{321}+s_{222}.
\end{align*}
The coefficient $2$ appears exactly because there are two lattice semistandard tableaux of content $(3,2,1)$.
[/example]
The example illustrates the practical value of skewing operators: the operator notation is compact, while the Schur expansion remembers the tableau count. For products, one needs a rule for how a single skewing operator distributes across two factors. The answer is controlled by splitting the skewing Schur function through Littlewood-Richardson coefficients.
[quotetheorem:5217]
[citeproof:5217]
This formula is the operator version of cutting a skew shape across a product. The sum over $\alpha,\beta$ is constrained by $c_{\alpha\beta}^{\lambda}$; replacing it by a sum over all subpartitions would overcount. For instance, take $\lambda=(2,2)$, $\mu=(2)$, and $\nu=(1,1)$. Since
\begin{align*}
s_2s_{11}=s_{31}+s_{211},
\end{align*}
there is no $s_{22}$ term, so $s_{22}^\perp(s_2s_{11})=0$ in degree $0$. A naive subpartition rule would include the pair $\alpha=(2)$ and $\beta=(1,1)$, contributing
\begin{align*}
(s_2^\perp s_2)(s_{11}^\perp s_{11})=1,
\end{align*}
even though $c_{2,11}^{22}=0$. The theorem does not say that $s_\lambda^\perp$ is an ordinary derivation: product splitting is governed by the Littlewood-Richardson coproduct, not by a two-term Leibniz rule. It prepares the ground for Cauchy-kernel arguments, where adjointness becomes a reproducing principle.
## The Cauchy Kernel as a Reproducing Kernel
The Cauchy identity from Chapter 6 is more than a generating function. A naive kernel with arbitrary coefficients in front of $s_\lambda[X]s_\lambda[Y]$ would not reproduce functions under the Hall inner product; the coefficients must match the dual basis for the pairing. The question here is: why does multiplying by the Cauchy kernel reproduce symmetric functions through the Hall inner product?
[definition: Cauchy Kernel]
For two alphabets $X=(x_1,x_2,\dots)$ and $Y=(y_1,y_2,\dots)$, the Cauchy kernel is
\begin{align*}
\Omega[X,Y]=\prod_{i,j}\frac{1}{1-x_i y_j}.
\end{align*}
[/definition]
The Schur Cauchy identity rewrites this infinite product as a diagonal sum over partitions. In completed symmetric functions, this is the statement
\begin{align*}
\Omega[X,Y]=\sum_\lambda s_\lambda[X]s_\lambda[Y].
\end{align*}
This diagonal expansion leads to the following theorem: pairing against the $X$ side reproduces the same symmetric function written in the $Y$ alphabet.
[quotetheorem:5218]
[citeproof:5218]
This is the symmetric-function analogue of a reproducing kernel in linear algebra: the kernel stores a copy of every basis vector, paired with its dual basis vector. The use of the Schur basis and Hall inner product is essential; changing the pairing without changing the dual basis in the kernel would leave extra Gram-matrix factors instead of reproducing $f[Y]$. For example, if a modified pairing satisfies $(s_\lambda,s_\mu)'=a_\lambda\delta_{\lambda\mu}$ with $a_\lambda \ne 1$, then the same kernel gives $(s_\mu[X],\Omega[X,Y])'_X=a_\mu s_\mu[Y]$, not $s_\mu[Y]$. To reproduce under that pairing, the kernel would have to use the dual coefficients $a_\lambda^{-1}s_\lambda[X]s_\lambda[Y]$. The theorem does not assert analytic convergence of the infinite product, since the identity is interpreted in the completed graded ring. The next natural question is what happens when a lowering operator acts on the kernel itself, rather than only after pairing.
[quotetheorem:5219]
[citeproof:5219]
The kernel identity converts an operator in the $X$ variables into multiplication in the $Y$ variables. The distinction between $X$ and $Y$ is necessary: $f[Y]$ is scalar with respect to the $X$-inner product, while $f[X]$ would act on the same variables being paired. The theorem does not say that $f^\perp$ is multiplication on arbitrary symmetric functions; it becomes multiplication only when applied to the Cauchy kernel. Applying this to the Pieri adjoints recovers the same strip-removal rules, now packaged as a single generating-function identity.
[example: Extracting Complete Functions From The Kernel]
Take $f=h_r$ in *Adjoint Operators On The Cauchy Kernel*. Then
\begin{align*}
h_r^\perp{}_X\Omega[X,Y]=h_r[Y]\Omega[X,Y].
\end{align*}
Using the Schur expansion $\Omega[X,Y]=\sum_\lambda s_\lambda[X]s_\lambda[Y]$ and applying $h_r^\perp$ only in the $X$ variables gives
\begin{align*}
h_r^\perp{}_X\Omega[X,Y]
&=h_r^\perp{}_X\sum_\lambda s_\lambda[X]s_\lambda[Y]\\
&=\sum_\lambda \bigl(h_r^\perp s_\lambda[X]\bigr)s_\lambda[Y]\\
&=\sum_\lambda\left(\sum_{\mu:\lambda/\mu\text{ is a horizontal }r\text{-strip}}s_\mu[X]\right)s_\lambda[Y]\\
&=\sum_{\mu,\lambda:\lambda/\mu\text{ is a horizontal }r\text{-strip}}s_\mu[X]s_\lambda[Y],
\end{align*}
where the third equality is the *Pieri Adjoint Rules*.
On the other hand, multiplying the Cauchy kernel by $h_r[Y]$ gives
\begin{align*}
h_r[Y]\Omega[X,Y]
&=h_r[Y]\sum_\mu s_\mu[X]s_\mu[Y]\\
&=\sum_\mu s_\mu[X]\bigl(h_r[Y]s_\mu[Y]\bigr)\\
&=\sum_\mu s_\mu[X]\left(\sum_{\lambda:\lambda/\mu\text{ is a horizontal }r\text{-strip}}s_\lambda[Y]\right)\\
&=\sum_{\mu,\lambda:\lambda/\mu\text{ is a horizontal }r\text{-strip}}s_\mu[X]s_\lambda[Y],
\end{align*}
where the third equality is the ordinary Pieri rule for multiplication by $h_r$. The two expansions have the same terms: removing a horizontal $r$-strip from the $X$-side Schur function is adjoint to adding a horizontal $r$-strip on the $Y$-side.
[/example]
The kernel viewpoint is often the shortest route to identities involving infinitely many simultaneous skewings. Plethystic notation gives a concise way to express the alphabets appearing in such identities.
## Plethystic Notation and Alphabets
Generating functions in symmetric function theory often need substitutions such as replacing an alphabet $X$ by $X+Y$, by $X-Y$, or by a scalar multiple $tX$. Writing these substitutions variable by variable breaks down for virtual alphabets such as $X-Y$, where there is no actual list of variables obtained by deleting the variables of $Y$ from those of $X$. The problem is to create notation that remembers how power sums transform, since the power sums freely generate $\operatorname{Sym}_\mathbb Q$.
[definition: Plethystic Substitution]
Let $R$ be a commutative $\mathbb Q$-algebra and let $A$ be a formal alphabet specified by elements $p_r[A] \in R$ for all $r \ge 1$. The plethystic substitution determined by $A$ is the $\mathbb Q$-algebra homomorphism
\begin{align*}
\operatorname{ev}_A: \operatorname{Sym}_\mathbb Q=\mathbb Q[p_1,p_2,\dots] \to R
\end{align*}
defined by $\operatorname{ev}_A(p_r)=p_r[A]$. For $f \in \operatorname{Sym}_\mathbb Q$, write
\begin{align*}
f[A] := \operatorname{ev}_A(f).
\end{align*}
[/definition]
This definition places all responsibility on the power sums. For an ordinary alphabet $X=(x_1,x_2,\dots)$, it gives $p_r[X]=\sum_i x_i^r$. To use plethystic notation in Cauchy identities, we need formal rules for combining alphabets even when no literal finite list of variables is present.
[definition: Sum And Difference Of Alphabets]
For alphabets $A$ and $B$, their plethystic sum and difference are determined by
\begin{align*}
p_r[A+B] &= p_r[A]+p_r[B], \\
p_r[A-B] &= p_r[A]-p_r[B]
\end{align*}
for all $r \ge 1$.
[/definition]
The minus sign in $A-B$ is plethystic subtraction, not deletion of variables. The practical question is how to compute complete symmetric functions after such a formal operation, since $A-B$ need not be represented by an actual list of variables. The generating series $H[-;t]$ is the right test object because its logarithm is governed by the power sums, so additive rules for alphabets become multiplicative rules for series.
[quotetheorem:5220]
[citeproof:5220]
The theorem reduces computations with $h_r[A\pm B]$ to ordinary multiplication of generating functions. The power-sum definition of $A-B$ is essential: treating $A-B$ as literal set subtraction would give the wrong answer whenever $A$ and $B$ are independent alphabets. The theorem does not say that every symmetric function satisfies a simple quotient rule under subtraction; the clean quotient is a special feature of the complete generating series. A low-degree expansion shows how the sign in a negative alphabet turns complete functions in $Y$ into elementary functions in $Y$.
[example: Subtracting An Alphabet]
Let $X$ and $Y$ be alphabets. In degree at most $2$, the complete generating series for the virtual alphabet $X-Y$ is obtained from
\begin{align*}
H[X-Y;t]
&=\frac{H[X;t]}{H[Y;t]}
=\frac{\prod_j(1-y_jt)}{\prod_i(1-x_it)}.
\end{align*}
The numerator expands as
\begin{align*}
\prod_j(1-y_jt)
&=1-\left(\sum_j y_j\right)t+\left(\sum_{j<k}y_jy_k\right)t^2+\text{terms of degree at least }3\text{ in }t\\
&=1-e_1[Y]t+e_2[Y]t^2+\text{terms of degree at least }3\text{ in }t.
\end{align*}
The denominator inverse is the complete generating series in $X$, so
\begin{align*}
\frac{1}{\prod_i(1-x_it)}
&=H[X;t]\\
&=1+h_1[X]t+h_2[X]t^2+\text{terms of degree at least }3\text{ in }t.
\end{align*}
Multiplying the two truncated series and collecting the coefficient of $t^2$ gives
\begin{align*}
H[X-Y;t]
&=\bigl(1-e_1[Y]t+e_2[Y]t^2+\cdots\bigr)
\bigl(1+h_1[X]t+h_2[X]t^2+\cdots\bigr),\\
[t^2]H[X-Y;t]
&=h_2[X]-e_1[Y]h_1[X]+e_2[Y].
\end{align*}
Since $e_1[Y]=h_1[Y]$ and $[t^2]H[X-Y;t]=h_2[X-Y]$, this gives
\begin{align*}
h_2[X-Y]=h_2[X]-h_1[X]h_1[Y]+e_2[Y].
\end{align*}
Thus the negative alphabet contributes the elementary term $e_2[Y]$, with the signs determined by the product $\prod_j(1-y_jt)$.
[/example]
Plethystic notation also compresses the Cauchy kernel. If $XY$ denotes the alphabet consisting formally of all products $x_i y_j$, and $H[A;t]=\sum_{r\ge0}h_r[A]t^r$ is the complete generating series from the preceding theorem, then
\begin{align*}
\Omega[X,Y]=H[XY;1].
\end{align*}
The power sums satisfy $p_r[XY]=p_r[X]p_r[Y]$, so the exponential form becomes
\begin{align*}
\Omega[X,Y]=\exp\left(\sum_{r\ge 1}\frac{p_r[X]p_r[Y]}{r}\right).
\end{align*}
[remark: Plethysm Versus Evaluation]
Plethystic notation is a rule for substituting alphabets into symmetric functions, not ordinary evaluation of a polynomial at a tuple. The notation is reliable when the action on power sums has been specified. This is why expressions such as $f[X-Y]$ and $f[XY]$ are meaningful even when there is no literal list of variables being substituted.
[/remark]
The operator viewpoint and plethystic viewpoint now reinforce each other. Adjoint operators describe how factors move across the Hall pairing, while plethystic alphabets describe how generating kernels change under substitution. This is the same structural pattern that appears in Fock-space constructions: multiplication operators behave like creation operators, their Hall adjoints behave like annihilation operators, and the Cauchy kernel packages the resulting duality. Together these tools provide the compact language used in Chapter 11 for Schur positivity and transition-coefficient questions.
With operators and generating series in hand, the remaining task is to compare how different bases transform into one another and what those transformations imply about positivity. Chapter 11 takes up transition coefficients across the monomial, complete, elementary, and Schur bases, using the tools built throughout the course to study positivity phenomena.
# 11. Transition Coefficients and Positivity Phenomena
This chapter studies the numerical coefficients that appear when symmetric functions are moved between the monomial, complete, elementary, and Schur bases. Earlier chapters built Schur functions through tableaux in Chapter 4, determinants in Chapter 5, Cauchy identities in Chapter 6, and product rules in Chapters 7 and 8; here the emphasis shifts to what those constructions say about positivity. The guiding problem is to recognise when a symmetric function is a nonnegative combination of Schur functions, and to identify structural reasons why such nonnegative coefficients appear.
## Kostka Numbers and Dominance Triangularity
The first question is how a Schur function looks in the monomial basis. Since Schur functions are defined by summing weights of semistandard tableaux, the coefficient of a monomial symmetric function should count tableaux with a prescribed content; the main point is that these numbers are arranged in a triangular matrix when partitions are ordered by dominance.
[definition: Kostka Number]
Let $\lambda$ and $\mu$ be partitions of $n$. The Kostka number $K_{\lambda\mu}$ is the number of semistandard Young tableaux of shape $\lambda$ and content $\mu$.
[/definition]
Thus $K_{\lambda\mu}$ counts fillings of the Young diagram of $\lambda$ whose entries are weakly increasing along rows, strictly increasing down columns, and in which the entry $i$ occurs $\mu_i$ times. This definition is useful because it gives the monomial expansion of Schur functions without further algebraic manipulation.
[quotetheorem:5196]
[citeproof:5196]
The expansion depends on using the semistandard tableau definition of $s_\lambda$ and on grouping weights by partition content. If the ordered exponent vector were used instead, the coefficient would no longer be indexed by partitions: for $s_2$ in two variables, the tableaux of shape $(2)$ contribute $x_1^2$, $x_1x_2$, and $x_2^2$, and the middle term has exponent vector $(1,1)$ while the first and last have ordered vectors $(2,0)$ and $(0,2)$. Grouping by partition is what turns these ordered weights into the symmetric expression $m_2+m_{11}$. The condition that $\mu$ has the same size as $\lambda$ is also necessary; no tableau of shape $\lambda$ can have total content different from $|\lambda|$, so no monomial symmetric function of another degree appears. Finally, the theorem does not say that the inverse change-of-basis matrix is nonnegative: in degree $3$, the inverse identity $m_{21}=s_{21}-2s_{111}$ gives an explicit negative coefficient. To understand which entries can occur and why the matrix can be inverted degree by degree, we need the partial order that records which partitions are more top-heavy.
[definition: Dominance Order]
Let $\lambda$ and $\mu$ be partitions of $n$. We say that $\lambda$ dominates $\mu$, written $\lambda \unrhd \mu$, if
\begin{align*}
\lambda_1+\cdots+\lambda_k \ge \mu_1+\cdots+\mu_k
\end{align*}
for every $k \ge 1$, where missing parts are treated as $0$.
[/definition]
Dominance is the order naturally detected by semistandard tableaux: the first $k$ rows can contain only a limited supply of entries larger than $k$, while the first $k$ values must fit into the upper part of the shape. The next theorem turns that geometric constraint into an algebraic triangularity statement for the Schur-to-monomial transition matrix. It is needed because positivity alone would not imply that the Schur functions form a basis or that basis conversion can be performed recursively.
[quotetheorem:5197]
[citeproof:5197]
The dominance hypothesis is essential: for $\lambda=(21)$ and $\mu=(3)$, the first dominance inequality fails because $2<3$, and there is no semistandard tableau of shape $(21)$ with three entries all equal to $1$ due to the strict column condition. The statement also does not claim that $K_{\lambda\mu}=1$ whenever $\lambda \unrhd \mu$; for example $K_{(21),(111)}=2$. After ordering partitions by any linear extension of dominance, the matrix $(K_{\lambda\mu})$ is upper triangular with diagonal entries $1$, so the Schur functions form a basis whenever the monomial symmetric functions do. The examples below use this triangularity to expose where Schur positivity can fail even when monomial data looks positive.
[example: Degree Three Kostka Matrix]
In degree $3$, order the partitions as $(3) \unrhd (21) \unrhd (111)$. For the one-row shape $(3)$, a semistandard tableau is just a weakly increasing row of length $3$, so the possible partition contents are represented by
\begin{align*}
111,\qquad 112,\qquad 123.
\end{align*}
Thus
\begin{align*}
s_3=m_3+m_{21}+m_{111}.
\end{align*}
For shape $(21)$, write the entries as
\begin{align*}
\begin{array}{cc}
a & b\\
c&
\end{array}
\end{align*}
with semistandard conditions $a\le b$ and $a<c$. Content $(3)$ is impossible, since three equal entries would force $a=c$. For content $(21)$, the entries are $1,1,2$, and the conditions force
\begin{align*}
a=1,\qquad c=2,\qquad b=1,
\end{align*}
so there is one tableau. For content $(111)$, the entries are $1,2,3$. Since $a<c$ and $a\le b$, the entry $a$ must be $1$. The remaining entries $2,3$ can be placed in the two positions $(b,c)$ as
\begin{align*}
(b,c)=(2,3)\qquad\text{or}\qquad (b,c)=(3,2),
\end{align*}
and both satisfy $a\le b$ and $a<c$. Hence
\begin{align*}
s_{21}=m_{21}+2m_{111}.
\end{align*}
For the vertical shape $(111)$, strict increase down the column forces the unique filling
\begin{align*}
\begin{array}{c}
1\\
2\\
3
\end{array}
\end{align*}
with content $(111)$, so
\begin{align*}
s_{111}=m_{111}.
\end{align*}
With rows and columns both ordered as $(3),(21),(111)$, the Kostka matrix is therefore
\begin{align*}
\begin{pmatrix}
1&1&1\\
0&1&2\\
0&0&1
\end{pmatrix},
\end{align*}
which is unitriangular. Solving the three equations from the bottom upward gives
\begin{align*}
m_{111}&=s_{111},\\
s_{21}&=m_{21}+2m_{111}=m_{21}+2s_{111},
\end{align*}
so
\begin{align*}
m_{21}=s_{21}-2s_{111}.
\end{align*}
Finally,
\begin{align*}
s_3&=m_3+m_{21}+m_{111}\\
&=m_3+(s_{21}-2s_{111})+s_{111}\\
&=m_3+s_{21}-s_{111},
\end{align*}
and therefore
\begin{align*}
m_3=s_3-s_{21}+s_{111}.
\end{align*}
The inverse transition already contains subtraction, even though every Kostka number in the Schur-to-monomial direction is nonnegative.
[/example]
This example already warns that monomial positivity and Schur positivity are different notions. A symmetric function with positive monomial coefficients can still require subtraction when expressed in the Schur basis.
## Schur Positivity Without Hall-Littlewood Theory
The next problem is to decide whether a given symmetric function belongs to the nonnegative cone spanned by Schur functions. We avoid Hall-Littlewood polynomials in this chapter and use only the changes of basis from Chapters 2 and 4, the tableaux from Chapter 4, and the product rules from Chapters 7 and 8.
[definition: Schur Positive Symmetric Function]
Let $\operatorname{Sym}_n$ denote the degree-$n$ homogeneous component of the ring of symmetric functions over $\mathbb{Z}$. A homogeneous symmetric function $f \in \operatorname{Sym}_n$ is Schur positive if there is a coefficient assignment
\begin{align*}
\{ \lambda : \lambda \vdash n \} &\longrightarrow \mathbb{Z}_{\ge 0},&
\lambda &\longmapsto c_\lambda
\end{align*}
such that
\begin{align*}
f = \sum_{\lambda \vdash n} c_\lambda s_\lambda.
\end{align*}
[/definition]
Schur positivity is stronger than having nonnegative monomial coefficients, because the Schur basis itself has positive monomial expansions. The inverse Kostka matrix is therefore the basic computational tool for testing small cases.
[example: Testing Schur Positivity Of $h_2h_1-e_3$]
For one-row and one-column partitions, $h_2=s_2$, $h_1=s_1$, and $e_3=s_{111}$. Applying the *one-box Pieri rule* to $s_2s_1$ means adding one box to the diagram $(2)$ with no two added boxes in the same column; since only one box is being added, the possible resulting partitions are
\begin{align*}
(3)\qquad\text{and}\qquad (21).
\end{align*}
Thus
\begin{align*}
h_2h_1
&=s_2s_1\\
&=s_3+s_{21}.
\end{align*}
Substituting $e_3=s_{111}$ then gives
\begin{align*}
h_2h_1-e_3
&=(s_3+s_{21})-s_{111}\\
&=s_3+s_{21}-s_{111}.
\end{align*}
In the Schur basis, the coefficient of $s_3$ is $1$, the coefficient of $s_{21}$ is $1$, and the coefficient of $s_{111}$ is $-1$, so this symmetric function is not Schur positive.
[/example]
The calculation illustrates the preferred method: rewrite the expression in a basis where the requested positivity is visible. Sometimes the input is given in a mixed basis, and the Kostka matrix must be inverted.
[example: Testing Schur Positivity Of $s_{31}+s_{22}-m_{31}$]
In degree $4$, the monomial $m_{31}$ has to be recovered from the whole dominance-triangular block involving $(31),(22),(211),(1111)$, because the Schur expansion of $s_{31}$ also contains lower monomial terms. Counting semistandard tableaux of the four relevant shapes and contents gives
\begin{align*}
s_{31} &= m_{31}+m_{22}+2m_{211}+3m_{1111},\\
s_{22} &= m_{22}+m_{211}+2m_{1111},\\
s_{211} &= m_{211}+3m_{1111},\\
s_{1111} &= m_{1111}.
\end{align*}
We solve this system from the bottom upward. The last equation gives
\begin{align*}
m_{1111}=s_{1111}.
\end{align*}
Substituting this into the third equation gives
\begin{align*}
s_{211}&=m_{211}+3m_{1111}\\
&=m_{211}+3s_{1111},
\end{align*}
so
\begin{align*}
m_{211}=s_{211}-3s_{1111}.
\end{align*}
Substituting $m_{211}=s_{211}-3s_{1111}$ and $m_{1111}=s_{1111}$ into the second equation gives
\begin{align*}
s_{22}
&=m_{22}+m_{211}+2m_{1111}\\
&=m_{22}+(s_{211}-3s_{1111})+2s_{1111}\\
&=m_{22}+s_{211}-s_{1111},
\end{align*}
hence
\begin{align*}
m_{22}=s_{22}-s_{211}+s_{1111}.
\end{align*}
Now substitute these three expressions into the first equation:
\begin{align*}
s_{31}
&=m_{31}+m_{22}+2m_{211}+3m_{1111}\\
&=m_{31}+(s_{22}-s_{211}+s_{1111})+2(s_{211}-3s_{1111})+3s_{1111}\\
&=m_{31}+s_{22}-s_{211}+s_{1111}+2s_{211}-6s_{1111}+3s_{1111}\\
&=m_{31}+s_{22}+s_{211}-2s_{1111}.
\end{align*}
Therefore
\begin{align*}
m_{31}=s_{31}-s_{22}-s_{211}+2s_{1111}.
\end{align*}
Using this inverse expansion in the given function,
\begin{align*}
s_{31}+s_{22}-m_{31}
&=s_{31}+s_{22}-(s_{31}-s_{22}-s_{211}+2s_{1111})\\
&=s_{31}+s_{22}-s_{31}+s_{22}+s_{211}-2s_{1111}\\
&=2s_{22}+s_{211}-2s_{1111}.
\end{align*}
The Schur coefficient of $s_{1111}$ is $-2$, so the symmetric function is not Schur positive. The calculation shows that a single visible occurrence of $m_{31}$ in $s_{31}$ is not enough for basis conversion; the inverse triangular calculation must also subtract the lower monomial terms appearing between $(31)$ and $(1111)$.
[/example]
Computational tests are useful, but the main theory seeks structural sources of Schur positivity. The most important sources in this course are products of Schur functions, skew Schur functions, and Cauchy identities.
[remark: Positivity Depends On The Chosen Basis]
The phrase positive coefficient has meaning only after a basis is named. The same function may be positive in the monomial basis, signed in the Schur basis, and positive again in another basis. Much of symmetric function theory studies which bases reveal hidden enumerative or representation-theoretic structure.
[/remark]
This warning becomes important when using identities: an identity that has subtraction in one basis can encode a positive counting rule after a nontrivial change of basis.
## Products and Littlewood-Richardson Coefficients
The next question is what happens when two Schur-positive functions are multiplied. Since the Schur functions form a basis and the ring product respects degree, there are unique coefficients describing the product of two Schur functions.
[definition: Littlewood-Richardson Coefficient]
For partitions $\lambda$, $\mu$, and $\nu$ with $|\nu|=|\lambda|+|\mu|$, the Littlewood-Richardson coefficient $c_{\lambda\mu}^{\nu}$ is defined by
\begin{align*}
s_\lambda s_\mu = \sum_{\nu} c_{\lambda\mu}^{\nu}s_\nu.
\end{align*}
[/definition]
The definition gives the coefficients algebraically, but it does not yet explain why they should be nonnegative. To prove positivity, the course replaces the unknown Schur coefficient by a concrete set of skew tableaux. This is the point at which the lattice-word condition becomes valuable: it selects exactly the tableaux whose rectification records the second factor.
[quotetheorem:5183]
[citeproof:5183]
The skew-shape hypothesis matters: if $\lambda$ is not contained in $\nu$, then $\nu/\lambda$ is not a diagram and there is no tableau set to count, while the coefficient $c_{\lambda\mu}^{\nu}$ is forced to be $0$ by degree or shape constraints in many such cases. The lattice-word condition is also a real restriction rather than decoration; semistandard skew tableaux with the right content may exist but fail the lattice condition and therefore do not contribute to $c_{\lambda\mu}^{\nu}$. The theorem does not give a subtraction-free formula for every Schur coefficient in an arbitrary symmetric function; it only handles Schur products and the related skew expansions. Its immediate consequence is the positivity phenomenon that motivates much of the chapter: products of Schur functions have nonnegative Schur expansions.
The rule is therefore both a formula and a warning about scope. It explains positivity for products in the Schur basis, but it does not make every basis transition positive and it does not remove the need to check the skew shape and content conditions. Later uses of the rule rely on exactly this precision: the same coefficient can be read algebraically as a product coefficient, geometrically as a multiplicity, or combinatorially as a count of lattice-word tableaux.
[quotetheorem:5221]
[citeproof:5221]
Both hypotheses are needed. If $f$ is not Schur positive, multiplication cannot repair the sign in general: taking $g=1$ leaves $fg=f$, so $s_3-s_{21}$ remains non-Schur-positive. Homogeneity keeps the statement inside a fixed graded component where Schur expansions are finite and degree comparisons are meaningful; for finite inhomogeneous sums the same argument applies degree by degree, but the theorem as stated is the homogeneous version used in the course. The result also says nothing about positivity in other bases, since Schur-positive products can have signed expansions after applying an inverse transition matrix. It explains why Pieri rules are positive: they are special cases of the Littlewood-Richardson rule where one factor is a one-row or one-column Schur function.
[example: Product $s_{21}s_{1}$]
By the *one-box Pieri rule*, multiplying $s_{21}$ by $s_1$ means adding one box to the Young diagram of $(21)$ and summing the Schur functions indexed by the resulting partitions. The diagram of $(21)$ has row lengths $2$ and $1$. Adding the new box to the first row gives row lengths
\begin{align*}
(2,1)+(1,0)=(3,1),
\end{align*}
so this contributes $s_{31}$. Adding the new box to the second row gives
\begin{align*}
(2,1)+(0,1)=(2,2),
\end{align*}
so this contributes $s_{22}$. Adding the new box as a third row gives
\begin{align*}
(2,1,0)+(0,0,1)=(2,1,1),
\end{align*}
so this contributes $s_{211}$. These are the only possible additions that leave a partition diagram, and each is obtained from exactly one added box. Therefore
\begin{align*}
s_{21}s_1=s_{31}+s_{22}+s_{211}.
\end{align*}
This is the smallest product in the chapter where a Schur product splits into three distinct Schur summands.
[/example]
The Littlewood-Richardson product rule is the algebraic source of positivity; the skew shapes introduced before it give the geometric source. They make the same coefficients visible from the perspective of shape difference.
## Skew Schur Functions and Cauchy Identities
A product $s_\lambda s_\mu$ asks how two straight shapes combine. The skew version asks the inverse question: if a large shape $\nu$ contains a smaller shape $\lambda$, how does the remaining skew diagram decompose into straight Schur functions?
[definition: Skew Schur Function]
Let $\lambda \subset \nu$ be partitions. The skew Schur function $s_{\nu/\lambda}$ is the generating function
\begin{align*}
s_{\nu/\lambda}=\sum_T x^T,
\end{align*}
where the sum is over semistandard tableaux $T$ of skew shape $\nu/\lambda$.
[/definition]
The skew Schur function is symmetric, although this is not immediate from the definition. Once symmetry is known, $s_{\nu/\lambda}$ lies in the homogeneous component $\operatorname{Sym}_{|\nu|-|\lambda|}$, so it can be expanded in the Schur basis of that degree. The central question is whether those coefficients remember the same product constants introduced above. The following theorem answers this by using the Hall inner product to turn multiplication into skewing.
[quotetheorem:5208]
[citeproof:5208]
The containment hypothesis is indispensable: if $\lambda \nsubseteq \nu$, the expression $\nu/\lambda$ is not a skew Young diagram and the tableau generating function has no intended meaning. The formula also does not assert that every Schur function of degree $|\nu|-|\lambda|$ appears; the coefficient is zero unless the Littlewood-Richardson conditions can be met. In the special case $\lambda=(1)$, the coefficient $c_{1\mu}^{\nu}$ is governed by the one-box Pieri rule: it is nonzero exactly when $\nu$ is obtained from $\mu$ by adding one box. Thus for $\nu=(32)$ the only possible $\mu$ are the partitions obtained by deleting a removable corner from $(32)$, namely $(31)$ and $(22)$, not every partition of $4$. Thus the theorem gives a second route to nonnegativity, but also a filtering rule: skew Schur functions are Schur positive because their coefficients are Littlewood-Richardson tableau counts, and those counts can vanish for structural reasons.
[example: A Ribbon Skew Shape]
Let $\nu=(32)$ and $\lambda=(1)$. The skew diagram $\nu/\lambda$ has
\begin{align*}
|\nu|-|\lambda|=(3+2)-1=4
\end{align*}
boxes, so its Schur expansion is in the degree-$4$ Schur basis. By *Skew Schur Expansion*,
\begin{align*}
s_{32/1}=\sum_{\mu\vdash 4} c_{1\mu}^{32}s_\mu.
\end{align*}
Thus the coefficient of $s_\mu$ is $c_{1\mu}^{32}$, not $c_{\mu,1}^{32}$ by notation alone; we must ask when $s_1s_\mu$ contains $s_{32}$.
By the one-box Pieri rule, $s_1s_\mu$ is the sum of Schur functions obtained by adding one box to the Young diagram of $\mu$. Since the partitions of $4$ are
\begin{align*}
(4),\qquad (31),\qquad (22),\qquad (211),\qquad (1111),
\end{align*}
we test which of them can become $(32)$ after one box is added. Deleting a removable corner from $(32)$ gives exactly the possible predecessors. Removing the last box in the first row gives
\begin{align*}
(3,2)-(1,0)=(2,2),
\end{align*}
and removing the last box in the second row gives
\begin{align*}
(3,2)-(0,1)=(3,1).
\end{align*}
There are no other removable corners in $(32)$, because only the final box of a row whose removal still leaves a partition diagram may be removed.
Therefore
\begin{align*}
c_{1,22}^{32}=1,\qquad c_{1,31}^{32}=1,
\end{align*}
and
\begin{align*}
c_{1,4}^{32}=c_{1,211}^{32}=c_{1,1111}^{32}=0.
\end{align*}
Substituting these coefficients into the skew expansion gives
\begin{align*}
s_{32/1}
&=c_{1,4}^{32}s_4+c_{1,31}^{32}s_{31}+c_{1,22}^{32}s_{22}
+c_{1,211}^{32}s_{211}+c_{1,1111}^{32}s_{1111}\\
&=0s_4+1s_{31}+1s_{22}+0s_{211}+0s_{1111}\\
&=s_{31}+s_{22}.
\end{align*}
In particular, $(211)$ does not occur because adding one box to $(211)$ can produce $(311)$, $(221)$, or $(2111)$, but not $(32)$.
[/example]
Cauchy identities supply a third positivity mechanism: they express a large symmetric kernel as a sum of positive Schur terms simultaneously in two alphabets. The kernel initially expands as a product of geometric series, which counts nonnegative integer matrices. To see Schur functions appear with positive coefficients, we need a bijection from those matrices to pairs of tableaux of a common shape.
[quotetheorem:5202]
[citeproof:5202]
The product side requires paired alphabets and a formal power-series interpretation; substituting arbitrary numbers without convergence control can make the infinite product meaningless. The identity also does not say that a single factor $\prod_j(1-x_i y_j)^{-1}$ is Schur positive in the $x$ variables, because Schur functions arise only after all matrix entries are organised by RSK into pairs of tableaux of a common shape. Computationally, the identity lets coefficients of the kernel be reorganised by shape: instead of counting matrices entry by entry, one counts pairs of tableaux with common shape and prescribed contents. Structurally, this is the model for later arguments in which a bilinear generating function reveals an orthonormal basis and converts coefficient extraction into an inner-product calculation. It is also the conceptual reason that Schur functions are self-dual for the Hall inner product, which is why the previous skew-expansion argument could read coefficients using scalar products.
[remark: Sources Of Schur Positivity]
In this chapter, Schur-positive expansions arise in three related ways: semistandard tableaux give Kostka nonnegativity, Littlewood-Richardson tableaux give product nonnegativity, and RSK gives the positive Schur expansion of the Cauchy kernel. Later representation-theoretic interpretations identify the same coefficients as multiplicities of irreducible polynomial representations, but the present course obtains them combinatorially.
[/remark]
The practical lesson is that Schur positivity should usually be proved by finding a positive model for the coefficients, not by expanding determinants and hoping cancellations resolve favourably. The transition matrices are still essential for examples and counterexamples, because they allow small cases to be checked exactly.
## Beyond These Notes
Symmetric functions sit at the meeting point of algebra, enumeration, and geometry. The representation-theoretic path identifies Schur functions with characters of polynomial representations and interprets Littlewood-Richardson coefficients as tensor-product multiplicities. The geometric path sends the same coefficients into Schubert calculus, where partitions index Schubert classes and Pieri-type rules describe products in cohomology rings. A third path deforms the bases studied here into Hall-Littlewood, Macdonald, and related families, where tableaux remain visible but the coefficients remember extra parameters.
Several internal Androma notes give useful background for those directions. [Cambridge II Representation Theory](/page/Cambridge%20II%20Representation%20Theory) supplies the character-theoretic setting behind Schur functions and multiplicities. [Algebraic Geometry Overview](/page/Algebraic%20Geometry%20Overview) and [Cambridge II Algebraic Geometry](/page/Cambridge%20II%20Algebraic%20Geometry) provide the geometric language that later turns partitions into Schubert varieties and intersection classes. [Cambridge III Combinatorics](/page/Cambridge%20III%20Combinatorics) is the natural companion for the enumerative and bijective methods used in tableau arguments.
## References
- [Cambridge II Representation Theory](/page/Cambridge%20II%20Representation%20Theory)
- [Cambridge III Combinatorics](/page/Cambridge%20III%20Combinatorics)
- [Algebraic Geometry Overview](/page/Algebraic%20Geometry%20Overview)
- [Cambridge II Algebraic Geometry](/page/Cambridge%20II%20Algebraic%20Geometry)
Contents
- Introduction
- What Problem Symmetric Functions Solve
- The Stable Ring And Its Bases
- Generating Functions As Algebraic Devices
- Schur Functions And Tableaux
- Product Rules And The Aim Of The Course
- How To Read These Notes
- 1. Symmetric Polynomials and the Stable Ring
- Symmetric Polynomials in Finitely Many Variables
- Partitions And Young Diagrams
- The Stable Ring Of Symmetric Functions
- Grading, Completion, And Infinite Identities
- 2. Classical Bases of Symmetric Functions
- Monomial Symmetric Functions
- Elementary and Complete Homogeneous Functions
- Power Sums and Newton Identities
- Degree Four Transition Matrices
- 3. Inner Products, Involutions, and Duality
- The Hall Inner Product from Power Sums
- Complete and Monomial Duality
- Adjoint Operators in Small Degrees
- The Omega Involution
- 4. Young Tableaux and Schur Functions
- Semistandard Tableaux and Weights
- Schur Functions as Tableau Generating Functions
- Kostka Numbers and Dominance Triangularity
- The Schur Basis of Symmetric Functions
- 5. Determinantal Formulae
- Alternants and the Bialternant Formula
- Complete Homogeneous Determinants
- Elementary Determinants and Conjugate Shapes
- Comparing the Three Determinantal Formulae
- 6. Cauchy Identities and Kernels
- The Cauchy Kernel Over Two Alphabets
- Dual Bases And The Schur Cauchy Identity
- The Omega Involution And The Dual Cauchy Identity
- 7. Pieri Rules and Skew Schur Functions
- Horizontal and Vertical Strips
- Pieri Rules for Complete and Elementary Multiplication
- Skew Schur Functions and Skew Jacobi-Trudi Formulae
- 8. Littlewood-Richardson Rule
- Schur Products and Structure Constants
- Reading Words and Lattice Conditions
- The Littlewood-Richardson Rule
- Skew Schur Functions
- Symmetry and Commutativity
- How to Compute with the Rule
- 9. Specializations and Enumerative Consequences
- Principal Specialization
- Hook Contents and Bounded Entries
- Standard Tableaux and the Hook Length Formula
- Consequences for Schur Enumeration
- 10. Operators and Generating Methods
- Multiplication and Adjoint Operators
- Skewing Operators and Skew Schur Functions
- The Cauchy Kernel as a Reproducing Kernel
- Plethystic Notation and Alphabets
- 11. Transition Coefficients and Positivity Phenomena
- Kostka Numbers and Dominance Triangularity
- Schur Positivity Without Hall-Littlewood Theory
- Products and Littlewood-Richardson Coefficients
- Skew Schur Functions and Cauchy Identities
- Beyond These Notes
- References
Algebraic Combinatorics I: Symmetric Functions
Content
Problems
History
Created by admin on 6/4/2026 | Last updated on 6/5/2026
Prerequisites (0/1 completed)
Log in to track your prerequisite progress.
Prerequisites Graph
Interactive dependency map showing prerequisite concepts
Loading dependency graph...
Theorem
Definition
Current
Requires
Rate this page
★
★
★
★
★
Poor
Excellent